n8n Batch Processing: Handle Large Datasets Without Crashing
Logic Workflow Team

#n8n #batch-processing #performance #loop-over-items #optimization #tutorial

Your n8n workflow works perfectly with 10 items. Run it with 1,000 and watch it crash.

This happens constantly. You build an automation that handles a handful of records flawlessly. Then real production data arrives: thousands of customer records, massive CSV imports, or API responses with hundreds of items. Suddenly you’re staring at “JavaScript heap out of memory” errors, frozen workflows, or API rate limit messages.

The problem isn’t n8n. The problem is how you’re processing data.

The Breaking Point

Most n8n workflows process items one by one or let nodes handle arrays automatically. This works fine for small datasets. But as data volume grows, you hit walls:

  • Memory exhaustion: n8n holds all items in memory during execution. 10,000 items with complex JSON structures can consume gigabytes of RAM.
  • API throttling: Firing 500 API calls in rapid succession triggers rate limits, resulting in 429 errors and failed workflows.
  • Timeout failures: Long-running operations exceed webhook timeouts or execution time limits.
  • Editor freezing: Large workflows with many items become unresponsive in the n8n interface.

What You’ll Learn

  • How n8n processes data internally and why large datasets cause crashes
  • Using the Loop Over Items node to process data in controlled batches
  • Choosing optimal batch sizes for different workloads
  • Rate limiting patterns with Wait nodes to respect API limits
  • Memory management strategies to prevent heap exhaustion
  • State tracking across batches with workflowStaticData
  • Advanced patterns for parallel processing and error isolation
  • Real-world examples with production-ready configurations

Understanding n8n’s Data Flow

Before fixing batch processing problems, you need to understand how n8n handles data.

How Items Flow Through Nodes

Every n8n workflow operates on items. An item is a JSON object containing your data. When a trigger fires or a node executes, it produces one or more items that flow to the next node.

Here’s the key insight: many n8n nodes automatically process all incoming items. The HTTP Request node, for example, executes once per item it receives. Pass it 100 items, and it makes 100 HTTP requests.

This automatic iteration is convenient but dangerous. You have no control over timing, memory usage, or error handling for individual items.
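
For illustration, here's what three items might look like arriving at an HTTP Request node (a hypothetical payload); the node fires one request per item:

[
  { "json": { "id": 1, "email": "a@example.com" } },
  { "json": { "id": 2, "email": "b@example.com" } },
  { "json": { "id": 3, "email": "c@example.com" } }
]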

When Automatic Processing Fails

Automatic item processing works well when:

  • You have fewer than 100 items
  • Each item requires minimal processing
  • External APIs have generous rate limits
  • Memory isn’t a concern

It fails when:

  • Datasets contain thousands of items
  • API endpoints have strict rate limits (common with email, CRM, and social media APIs)
  • Items contain large payloads (binary files, nested JSON)
  • You need progress tracking or partial failure recovery

The Memory Model

n8n runs on Node.js, which has a default heap size limit (typically 512MB to 1.7GB depending on your system). Every item in your workflow consumes memory. When processing large datasets:

  1. The trigger loads all items into memory
  2. Each node transformation creates new objects
  3. Results accumulate until execution completes
  4. Garbage collection can’t reclaim memory fast enough

The result: your n8n instance crashes with FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory.
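
Before settling on a batch size, you can gauge per-item size in a Code node. This is a rough heuristic, not a memory profiler: serialized JSON length approximates, and usually understates, actual heap usage.

// Estimate average item size from a sample of the incoming data
if (items.length === 0) return [];
const sample = items.slice(0, 10);
const avgBytes = sample.reduce(
  (sum, item) => sum + JSON.stringify(item.json).length,
  0
) / sample.length;

// Rough total footprint; real heap usage will be several times higher
return [{
  json: {
    totalItems: items.length,
    avgBytes: Math.round(avgBytes),
    estimatedMB: Math.round((avgBytes * items.length) / (1024 * 1024))
  }
}];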

The Loop Over Items Node

The Loop Over Items node (previously called “Split in Batches”) is n8n’s answer to batch processing. It divides your dataset into smaller chunks and processes them sequentially.

How It Works

The Loop Over Items node:

  1. Receives all items from the previous node
  2. Outputs a batch of items (the size you specify)
  3. Waits for downstream processing to complete
  4. Outputs the next batch
  5. Repeats until all items are processed
  6. Signals completion through its “done” output

This sequential processing gives you control over memory usage, timing, and error handling.

Node Configuration

Batch Size

The number of items to process in each iteration. Default is 1.

Batch Size: 10

Lower values reduce memory pressure but increase total execution time. Higher values process faster but consume more resources.

Options: Reset

Controls whether the node clears its internal loop state and treats incoming data as a fresh set. Leave it at the default unless you have a specific reason to carry loop state across runs.

Understanding the Two Outputs

The Loop Over Items node has two outputs:

| Output | When It Fires | Use Case |
| --- | --- | --- |
| Done (first output) | After all batches complete | Connect final actions (summary, notification) |
| Loop (second output) | For each batch of items | Connect your processing nodes here |

Important: The loop output must connect back to the Loop Over Items node to continue processing. Forgetting this connection causes the workflow to stop after the first batch.

Basic Workflow Structure

Trigger → Get Data → Loop Over Items → Process Items → (back to Loop Over Items)
                            ↓
                    (When done) → Send Summary

Complete Workflow Example

Here’s a workflow that processes items in batches of 10:

{
  "nodes": [
    {
      "parameters": {},
      "name": "Manual Trigger",
      "type": "n8n-nodes-base.manualTrigger",
      "position": [250, 300]
    },
    {
      "parameters": {
        "batchSize": 10,
        "options": {}
      },
      "name": "Loop Over Items",
      "type": "n8n-nodes-base.splitInBatches",
      "position": [450, 300]
    },
    {
      "parameters": {
        "url": "https://api.example.com/process",
        "method": "POST",
        "body": "={{ $json }}"
      },
      "name": "HTTP Request",
      "type": "n8n-nodes-base.httpRequest",
      "position": [650, 300]
    }
  ],
  "connections": {
    "Manual Trigger": {
      "main": [[{"node": "Loop Over Items", "type": "main", "index": 0}]]
    },
    "Loop Over Items": {
      "main": [
        null,
        [{"node": "HTTP Request", "type": "main", "index": 0}]
      ]
    },
    "HTTP Request": {
      "main": [[{"node": "Loop Over Items", "type": "main", "index": 0}]]
    }
  }
}

Notice how the HTTP Request connects back to Loop Over Items. This creates the loop that processes all batches.

Tracking Loop Progress

You can access loop metadata in expressions:

// Current batch index (0-based)
{{ $node["Loop Over Items"].context["currentRunIndex"] }}

// Check if all items processed
{{ $node["Loop Over Items"].context["noItemsLeft"] }}

This is useful for progress logging or conditional logic based on batch position.

Choosing the Right Batch Size

Batch size significantly impacts performance, reliability, and resource usage. There’s no universal “best” value.

Factors That Influence Batch Size

1. API Rate Limits

If your external API allows 100 requests per minute, calculate:

Batch Size × Batches Per Minute ≤ Rate Limit

With a 10-second wait between batches (6 batches/minute), you can safely use batch size 15.
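
The same check expressed as a few lines of JavaScript, using this example's numbers (the values are illustrative, not universal):

// Largest batch size that stays under a per-minute rate limit
const rateLimit = 100;                       // API's requests-per-minute limit
const waitSeconds = 10;                      // Wait node delay between batches
const batchesPerMinute = 60 / waitSeconds;   // 6
const maxBatchSize = Math.floor(rateLimit / batchesPerMinute); // 16

// Batch size 15 satisfies 15 Ă— 6 = 90 ≤ 100
console.log(`Use a batch size of at most ${maxBatchSize}`);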

2. Payload Size

Large JSON payloads or binary data consume more memory per item. Reduce batch size when processing:

  • Images or PDFs
  • Nested JSON structures with many fields
  • Base64-encoded content

3. Processing Time

If each item takes 2 seconds to process, a batch of 50 items takes nearly 2 minutes. Consider whether you need faster feedback or can tolerate longer batch times.

4. Error Tolerance

Smaller batches isolate failures better. If a batch of 100 fails, you lose 100 items. A batch of 10 loses only 10.

Batch Size Recommendations

| Use Case | Recommended Size | Reason |
| --- | --- | --- |
| API calls with rate limits | 10-25 | Stay under rate limits with Wait nodes |
| Database operations | 50-100 | Databases handle bulk operations well |
| Email sending | 5-10 | Email APIs have strict limits |
| File processing | 1-5 | Large memory footprint per item |
| Lightweight transformations | 100-200 | Minimal resource usage |
| External webhooks | 10-20 | Avoid overwhelming recipients |

Testing Methodology

  1. Start small: Begin with batch size 10
  2. Monitor memory: Watch n8n’s memory usage during execution
  3. Check rate limits: Verify no 429 errors occur
  4. Measure time: Record total execution time
  5. Increase gradually: Double batch size and repeat
  6. Find the ceiling: Stop when you see errors or degradation

For production workflows processing thousands of items, testing with realistic data volumes is essential before deployment.

Rate Limiting Patterns

Most APIs restrict request frequency. Batch processing alone doesn’t solve rate limiting. You need controlled delays between batches.

Adding Wait Nodes

The Wait node pauses execution for a specified duration. Insert it between your processing node and the loop return:

Loop Over Items → HTTP Request → Wait (10 seconds) → Loop Over Items

Wait Node Configuration:

Wait Time: 10
Unit: Seconds

Calculating Delay Times

Formula (for a per-minute limit):

Delay = (60 Ă· Requests Per Minute Limit) Ă— Batch Size

For per-hour or per-day limits, use the matching time base (3,600 seconds per hour; 86,400 per day).

Example: Gmail API (250 emails/day ≈ 10/hour)

Delay = (3600 Ă· 10) Ă— batch_size
Delay = 360 seconds Ă— batch_size

For batch size 5: 1,800 seconds (30 minutes) between batches.
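
The same arithmetic as a small helper, if you prefer to compute delays in a Code node (a sketch; pass the limit in whatever window the API documents):

// Seconds to wait between batches for a limit of `limit` requests
// per `windowSeconds` (60 = per minute, 3600 = per hour, 86400 = per day)
function delayBetweenBatches(limit, windowSeconds, batchSize) {
  return (windowSeconds / limit) * batchSize;
}

console.log(delayBetweenBatches(10, 3600, 5)); // Gmail example: 1800s (30 min)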

Complete Rate-Limited Workflow

{
  "nodes": [
    {
      "parameters": {},
      "name": "Manual Trigger",
      "type": "n8n-nodes-base.manualTrigger",
      "position": [250, 300]
    },
    {
      "parameters": {
        "batchSize": 5,
        "options": {}
      },
      "name": "Loop Over Items",
      "type": "n8n-nodes-base.splitInBatches",
      "position": [450, 300]
    },
    {
      "parameters": {
        "resource": "message",
        "operation": "send"
      },
      "name": "Gmail",
      "type": "n8n-nodes-base.gmail",
      "position": [650, 300]
    },
    {
      "parameters": {
        "amount": 60,
        "unit": "seconds"
      },
      "name": "Wait",
      "type": "n8n-nodes-base.wait",
      "position": [850, 300]
    }
  ],
  "connections": {
    "Manual Trigger": {
      "main": [[{"node": "Loop Over Items", "type": "main", "index": 0}]]
    },
    "Loop Over Items": {
      "main": [
        null,
        [{"node": "Gmail", "type": "main", "index": 0}]
      ]
    },
    "Gmail": {
      "main": [[{"node": "Wait", "type": "main", "index": 0}]]
    },
    "Wait": {
      "main": [[{"node": "Loop Over Items", "type": "main", "index": 0}]]
    }
  }
}

For more comprehensive rate limiting strategies including exponential backoff, see our API rate limits guide.

Handling 429 Responses

Even with delays, occasional rate limit errors occur. Configure the HTTP Request node to retry:

  1. Open Settings in the HTTP Request node
  2. Enable Retry On Fail
  3. Set Max Tries: 3
  4. Set Wait Between Tries: 5000ms

This handles transient rate limits without failing the entire workflow.

Memory Management Strategies

When processing tens of thousands of items, memory management becomes critical. n8n workflows that work with 1,000 items might crash with 50,000.

Increasing Node.js Heap Size

By default, Node.js limits heap memory. Increase it with the NODE_OPTIONS environment variable:

# Allow 4GB heap memory
NODE_OPTIONS=--max-old-space-size=4096

For Docker deployments:

environment:
  - NODE_OPTIONS=--max-old-space-size=4096

Warning: This raises the ceiling but doesn’t solve the underlying problem. You’re buying time, not fixing architecture.

Reducing Data Between Nodes

Each node receives the full output of previous nodes. Strip unnecessary fields early:

// In a Code node, keep only needed fields
return items.map(item => ({
  json: {
    id: item.json.id,
    email: item.json.email
    // Omit large fields like 'fullPayload', 'attachments'
  }
}));

Using Sub-Workflows for Memory Isolation

Sub-workflows execute in isolated contexts. When a sub-workflow completes, its memory is released.

Pattern:

Main Workflow: Trigger → Loop Over Items → Execute Workflow (sub) → Loop
Sub-Workflow: Execute Workflow Trigger → Heavy Processing → Return Results

Each batch processes in the sub-workflow’s memory space, then releases. The main workflow’s memory stays constant.

When to Use Queue Mode

For very large datasets (50,000+ items) or mission-critical processing, consider n8n queue mode. Queue mode:

  • Distributes execution across multiple worker processes
  • Handles failures without losing entire workflows
  • Scales horizontally with infrastructure

Queue mode requires Redis and PostgreSQL but solves memory and reliability problems at scale.

Binary Data Considerations

Binary data (images, PDFs, files) consumes far more memory than JSON. When batch processing binary files:

  1. Use external storage: Store files in S3/MinIO, pass URLs instead of file content
  2. Process one at a time: Set batch size to 1 for heavy file operations
  3. Clean up immediately: Delete temporary files after processing

For queue mode deployments, configure S3 binary data storage so workers can share files:

N8N_DEFAULT_BINARY_DATA_MODE=s3
N8N_EXTERNAL_STORAGE_S3_BUCKET_NAME=your-bucket

State Management with workflowStaticData

Batch processing often requires tracking state across iterations: which items succeeded, cumulative counts, or data needed for the final batch.

What is workflowStaticData?

The workflowStaticData object is accessible in Code nodes and persists throughout a workflow execution, so it survives across loop iterations. (In production executions n8n also saves it between runs; it is not persisted during manual test executions.)

// Access static data
const staticData = $getWorkflowStaticData('global');

// Set a value
staticData.processedCount = (staticData.processedCount || 0) + items.length;

// Read the value later
console.log(staticData.processedCount);

Tracking Progress

Count processed items across all batches:

const staticData = $getWorkflowStaticData('global');

// Initialize on first batch
if (!staticData.totalProcessed) {
  staticData.totalProcessed = 0;
  staticData.failed = [];
  staticData.startTime = new Date().toISOString();
}

// Update counts
staticData.totalProcessed += items.length;

return items;

In your final “done” branch, read the totals:

const staticData = $getWorkflowStaticData('global');

return [{
  json: {
    totalProcessed: staticData.totalProcessed,
    failedItems: staticData.failed.length,
    startTime: staticData.startTime,
    endTime: new Date().toISOString()
  }
}];

Distinguishing Creates vs Updates

When syncing data, you often need to create new records or update existing ones:

const staticData = $getWorkflowStaticData('global');

// Initialize the lookup set once. 'Get Existing Records' is a
// hypothetical earlier node that fetched current records from your
// datastore. A Set works across loop iterations within one execution,
// but won't serialize if saved between executions.
if (!staticData.existingIds) {
  const existingRecords = $('Get Existing Records').all();
  staticData.existingIds = new Set(existingRecords.map(r => r.json.id));
}

// Categorize current batch
const toCreate = [];
const toUpdate = [];

for (const item of items) {
  if (staticData.existingIds.has(item.json.id)) {
    toUpdate.push(item);
  } else {
    toCreate.push(item);
    staticData.existingIds.add(item.json.id);
  }
}

Manual Checkpoint Patterns

n8n doesn’t support automatic resume from failures. But you can implement manual checkpointing:

const staticData = $getWorkflowStaticData('global');

// Track last successfully processed ID
staticData.lastProcessedId = items[items.length - 1].json.id;

// On workflow restart (in trigger logic), skip already-processed items
const startFromId = staticData.lastProcessedId || 0;

This requires storing state externally (database, file) between workflow executions for true resume capability.
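
A minimal sketch of the skip logic, assuming a hypothetical 'Load Checkpoint' node earlier in the workflow read the last processed ID from that external store:

// Skip items already handled in a previous run.
// 'Load Checkpoint' is a hypothetical node that reads the checkpoint
// from a database or file before the loop starts.
const lastId = $('Load Checkpoint').first().json.lastProcessedId ?? 0;

// Assumes ascending numeric IDs; keep only items after the checkpoint
return items.filter(item => item.json.id > lastId);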

Advanced Patterns

Beyond basic batching, several patterns solve complex processing requirements.

Parallel Batch Processing

Process multiple batches simultaneously using sub-workflows:

Trigger → Split Data Into Chunks (Code) → Execute Workflow (parallel) → Aggregate Results

The Code node splits data into chunks:

const chunkSize = 1000;
const chunks = [];

for (let i = 0; i < items.length; i += chunkSize) {
  chunks.push({
    json: {
      chunkIndex: Math.floor(i / chunkSize),
      items: items.slice(i, i + chunkSize).map(item => item.json)
    }
  });
}

return chunks;

Each chunk executes as a separate sub-workflow run. Configure the Execute Workflow node to run once per item; for true parallelism, disable its "Wait For Sub-Workflow Completion" option so the main workflow fans chunks out without blocking on each one.

Caution: Parallel processing multiplies API calls. Ensure your rate limits can handle concurrent requests from all chunks.

Error Isolation with Continue On Fail

Prevent one failed item from stopping the entire batch:

  1. On your processing node, enable Settings → Continue On Fail
  2. Add an IF node after to check for errors
  3. Route errors to a logging/retry path

Loop Over Items → HTTP Request (continue on fail) → IF (has error?)
                                                        ↓ Yes: Log Error
                                                        ↓ No: Continue

Check for errors:

{{ $json.error ? true : false }}
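
If you'd rather branch in code than with an IF node, here is a minimal sketch that separates failures for a logging path (it assumes Continue On Fail attaches an error field to failed items, as above):

// Partition the batch into successes and failures
const failures = items.filter(item => item.json.error);
const successes = items.filter(item => !item.json.error);

console.log(`${failures.length} of ${items.length} items failed`);

// Pass successes onward; route failures from a separate Code node
// or log them here before returning
return successes;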

Progress Notifications

Send periodic updates during long-running batch jobs:

const staticData = $getWorkflowStaticData('global');
const batchIndex = $node["Loop Over Items"].context["currentRunIndex"];

// Notify every 10 batches
if (batchIndex % 10 === 0 && batchIndex > 0) {
  // This item triggers notification
  return [{
    json: {
      notify: true,
      progress: `Processed ${staticData.totalProcessed} items`
    }
  }];
}

return []; // No notification for other batches

Route notification items to Slack, email, or webhook nodes.

Combining with Aggregate Node

After batch processing, the Aggregate node combines results from all iterations:

Loop Over Items → Process → Loop → (Done) → Aggregate → Final Output

Aggregate settings:

  • Fields to Aggregate: Select fields to combine
  • Aggregate: Choose “All Item Data” or specific fields
  • Output: Single item containing all results

This is useful for generating summary reports after processing.

Troubleshooting Common Issues

Batch processing introduces unique failure modes. Here’s how to diagnose and fix them.

Problem-Solution Reference

| Problem | Likely Cause | Solution |
| --- | --- | --- |
| "Heap out of memory" error | Dataset too large for available RAM | Reduce batch size, increase NODE_OPTIONS, use sub-workflows |
| Loop freezes on large datasets | Internal serialization bottleneck (40,000+ items) | Split data before Loop Over Items, use pagination |
| Batches stop before completion | Missing loop connection | Ensure the processing node connects back to Loop Over Items |
| 429 rate limit errors | API throttling | Add a Wait node, increase delay, reduce batch size |
| Infinite loop crash | Loop connected incorrectly | Check connections; ensure the "done" output isn't looped |
| Inconsistent batch sizes | Dynamic data during execution | Pre-fetch all data before Loop Over Items |
| Memory grows with each batch | Items accumulating in workflow | Use sub-workflows, strip unnecessary fields |

Debugging Large Dataset Freezes

If Loop Over Items freezes with large datasets (40,000+ items), the node’s internal serialization may be the bottleneck. Workarounds:

1. Pre-split in Code Node

const batchSize = 100;
const batches = [];

for (let i = 0; i < items.length; i += batchSize) {
  batches.push({
    json: {
      batchIndex: Math.floor(i / batchSize),
      items: items.slice(i, i + batchSize)
    }
  });
}

return batches;

Then process each batch object instead of using Loop Over Items.

2. Paginated Data Fetching

Instead of loading all items, fetch pages from your data source and process each page as a batch. This keeps memory constant regardless of total dataset size.
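
What this looks like depends entirely on your data source. Here's a generic sketch for a Code node against a hypothetical paginated endpoint; the URL, parameter names, and response shape are invented, and this.helpers.httpRequest availability depends on your n8n version:

// Fetch and process one page at a time so memory stays bounded
const pageSize = 1000;
let page = 0;
let processed = 0;

while (true) {
  // Hypothetical endpoint; swap in your API's pagination scheme
  const response = await this.helpers.httpRequest({
    url: `https://api.example.com/records?page=${page}&pageSize=${pageSize}`,
    json: true,
  });

  const records = response.records ?? [];
  if (records.length === 0) break;

  // Do lightweight per-page work here and keep only what you need
  processed += records.length;
  page += 1;
}

return [{ json: { processed } }];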

Monitoring Batch Progress

Without built-in metrics, add manual logging:

const batchIndex = $node["Loop Over Items"].context["currentRunIndex"];
const itemCount = items.length;

console.log(`Processing batch ${batchIndex}, ${itemCount} items`);

// Timestamps for duration tracking
if (batchIndex === 0) {
  $getWorkflowStaticData('global').startTime = Date.now();
}

return items;

Check logs via your n8n hosting platform or Docker logs.

For persistent debugging assistance, try our free workflow debugger tool.

Real-World Examples

These examples demonstrate batch processing patterns for common scenarios.

Example 1: CRM Contact Sync (10,000 Contacts)

Scenario: Sync contacts from a CSV export to HubSpot.

Configuration:

  • Batch size: 50 (HubSpot allows 100/10sec, we stay conservative)
  • Wait time: 2 seconds between batches
  • Error handling: Continue on fail, log errors

Workflow Structure:

Read CSV → Loop Over Items (50) → HubSpot Create/Update → Wait (2s) → Loop
                                         ↓ (on error)
                                    Log to Google Sheets

Key Code (Pre-processing):

// Normalize CSV data for HubSpot format
return items.map(item => ({
  json: {
    email: item.json.Email?.toLowerCase().trim(),
    firstname: item.json['First Name'],
    lastname: item.json['Last Name'],
    company: item.json.Company
  }
})).filter(item => item.json.email); // Skip rows without email

Example 2: Email Campaign with Gmail

Scenario: Send personalized emails to 500 recipients without hitting Gmail limits.

Configuration:

  • Batch size: 5
  • Wait time: 60 seconds (Gmail daily limit: ~500 for regular accounts)
  • Estimated runtime: ~100 minutes for 500 emails

Rate Limit Math:

500 emails Ă· 5 per batch = 100 batches
100 batches Ă— 60 seconds = 6,000 seconds = ~100 minutes

Workflow Structure:

Get Recipients → Loop Over Items (5) → Gmail Send → Wait (60s) → Loop
                        ↓ (done)
                  Send Summary Email

Example 3: Large CSV Processing

Scenario: Process a 100,000-row CSV file, transform data, and insert into PostgreSQL.

Configuration:

  • Pre-split into 1,000-row chunks
  • Process each chunk via sub-workflow
  • Sub-workflow batch size: 100 for database inserts

Main Workflow:

// Split large CSV rows (already loaded as items) into manageable chunks
const chunkSize = 1000;
const allData = items;
const chunks = [];

for (let i = 0; i < allData.length; i += chunkSize) {
  chunks.push({
    json: {
      chunkId: Math.floor(i / chunkSize),
      data: allData.slice(i, i + chunkSize)
    }
  });
}

return chunks;

Sub-Workflow:

Execute Workflow Trigger (receives chunk) → Loop Over Items (100) → Postgres Insert → Loop → Return Results

This architecture keeps memory bounded regardless of CSV size and provides natural checkpoints at chunk boundaries.

Frequently Asked Questions

What’s the difference between Loop Over Items and letting nodes process arrays automatically?

Most n8n nodes iterate over all incoming items automatically. If you send 100 items to an HTTP Request node, it makes 100 requests. The difference is control.

Loop Over Items gives you:

  • Controlled batch sizes (process 10 at a time instead of all 100)
  • Ability to add delays between batches
  • Progress tracking across iterations
  • Memory management by limiting concurrent processing

Use automatic processing for small datasets or when order and timing don’t matter. Use Loop Over Items when you need control over how and when items are processed.

How do I prevent memory errors when processing 50,000+ items?

Multiple strategies work together:

  1. Never load all items at once. Paginate your data source and process pages sequentially.
  2. Use small batch sizes (10-50 items) to limit active memory.
  3. Strip unnecessary data early in the workflow.
  4. Use sub-workflows to isolate memory for heavy processing.
  5. Increase heap size with NODE_OPTIONS=--max-old-space-size=4096 as a stopgap.
  6. Consider queue mode for reliable processing at scale.

The architectural answer is: don’t hold 50,000 items in memory. Fetch, process, and discard in batches.

Can I resume a batch workflow from where it failed?

n8n doesn’t natively support resume from failure. If a workflow fails mid-batch, it restarts from the beginning.

Workarounds:

  1. Track progress externally: Store the last processed ID in a database or file. On restart, query from that ID.
  2. Idempotent operations: Design operations so reprocessing doesn’t create duplicates (use upserts, check existence first).
  3. Checkpoint to database: After each batch, record progress. On failure, query the checkpoint and skip completed items.

For mission-critical workflows, queue mode provides better failure recovery.

What batch size should I use for API calls with rate limits?

Calculate based on the API’s stated limits:

Safe Batch Size = (Rate Limit Ă· Batches Per Minute) Ă— Safety Factor

Example: API allows 60 requests per minute. With 10-second waits between batches (6 batches per minute):

Batch Size = (60 Ă· 6) Ă— 0.8 = 8 items per batch

The 0.8 safety factor accounts for timing variations. Start conservative and increase if stable.

Common API limits and recommended batch sizes:

| API | Typical Limit | Suggested Batch | Wait Time |
| --- | --- | --- | --- |
| Gmail | 250/day | 5 | 60s+ |
| HubSpot | 100/10sec | 50 | 5s |
| Stripe | 100/sec | 50 | 1s |
| Slack | 1/sec (certain APIs) | 1 | 1s |

Why does my Loop Over Items node freeze with large datasets?

The Loop Over Items node serializes all items internally before processing. With datasets exceeding 40,000-50,000 items, this serialization becomes a bottleneck, causing the node to appear frozen.

Solutions:

  1. Paginate at the source: Fetch data in pages of 5,000 items. Process each page as a separate workflow execution or sub-workflow.

  2. Pre-chunk with Code node: Split items into smaller arrays before Loop Over Items sees them.

  3. Use Execute Workflow for chunks: Split data into chunks, then call a sub-workflow for each chunk. The sub-workflow uses its own Loop Over Items with a manageable dataset.

This is a known limitation with very large datasets. The workaround is architectural: don’t pass 50,000 items to a single Loop Over Items node.


Next Steps

Batch processing transforms n8n from a tool for simple automations into a platform capable of handling production workloads. Start with small batches, add appropriate delays, and monitor your workflows under real conditions.

For complex data processing requirements, our workflow development services can architect solutions tailored to your specific needs. If you’re hitting scaling limits, our n8n consulting helps optimize existing workflows for performance.

Check out our n8n best practices guide for additional patterns that complement batch processing.
