Intelligent data pipeline for Elastic

Built for the Elastic Stack: ECS-compliant by default, Elasticsearch-native, and validated against Elasticsearch, Elastic Cloud, and Elastic Security workflows.

ECS compliant · Elasticsearch native · Elastic Security ready

Key Capabilities

Why teams choose DataStream for Elastic

A purpose-built pipeline that unlocks the full value of Elastic – without the infrastructure and storage costs that usually come with comprehensive log coverage.

Wide source coverage

DataStream supports all widely used sources out of the box: on-premises, cloud, legacy systems, OT/ICS networks, IoT devices, and custom applications. No custom Filebeat plugin or Beats configuration is required.

ECS normalization engine

Automated mapping and validation against Elastic Common Schema. Every event is enriched with the correct ECS fields so Kibana dashboards, detection rules, and analytics work on day one.
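
To make the idea concrete, here is a minimal sketch of what mapping a flat vendor event onto nested ECS fields can look like. It is not DataStream's implementation: the raw event, its key names, and the `FIELD_MAP` table are hypothetical, and only the ECS field names on the right-hand side (`source.ip`, `destination.port`, `event.action`, and so on) come from the Elastic Common Schema.

```python
from datetime import datetime, timezone

# Hypothetical flat firewall event; key names are illustrative only.
RAW_EVENT = {
    "ts": 1700000000,
    "src": "10.0.0.5",
    "dst": "203.0.113.9",
    "dport": "443",
    "act": "deny",
    "user": "alice",
}

# Assumed mapping from vendor keys to ECS dotted field names.
FIELD_MAP = {
    "src": "source.ip",
    "dst": "destination.ip",
    "dport": "destination.port",
    "act": "event.action",
    "user": "user.name",
}


def set_dotted(doc, dotted_key, value):
    """Write value into a nested dict at a dotted path such as 'source.ip'."""
    *parents, leaf = dotted_key.split(".")
    node = doc
    for part in parents:
        node = node.setdefault(part, {})
    node[leaf] = value


def to_ecs(raw):
    """Map a flat vendor event to a nested ECS-style document."""
    doc = {}
    ts = datetime.fromtimestamp(raw["ts"], tz=timezone.utc)
    doc["@timestamp"] = ts.isoformat().replace("+00:00", "Z")
    for vendor_key, ecs_key in FIELD_MAP.items():
        if vendor_key in raw:
            set_dotted(doc, ecs_key, raw[vendor_key])
    # ECS defines destination.port as a number, so cast the string value.
    if "port" in doc.get("destination", {}):
        doc["destination"]["port"] = int(doc["destination"]["port"])
    # event.category is an array in ECS; "network" is an assumed classification.
    set_dotted(doc, "event.category", ["network"])
    return doc
```

Calling `to_ecs(RAW_EVENT)` yields a document with `source.ip`, `destination.ip`, `destination.port`, `event.action`, and `user.name` populated, which is the shape that Kibana dashboards and detection rules query.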

Data ingest cost optimization

Filter, deduplicate, sample, and route at the edge before data hits Elasticsearch. Typical customers reduce their daily ingest volume by 50–90% while expanding security coverage.
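
As an illustration of edge-side reduction, the sketch below drops exact repeats, always keeps events flagged as critical, and samples the remainder 1-in-N. The function names, the `is_critical` callback, and the sampling rate are assumptions made for this example, not DataStream settings.

```python
import hashlib
import json


def fingerprint(event, ignore=("@timestamp",)):
    """Stable hash of an event, ignoring top-level fields that differ between repeats."""
    core = {k: v for k, v in event.items() if k not in ignore}
    payload = json.dumps(core, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def reduce_stream(events, is_critical, sample_every=10):
    """Drop exact repeats, keep every critical event, sample the rest 1-in-N."""
    seen = set()
    kept = []
    noncritical_count = 0
    for event in events:
        fp = fingerprint(event)
        if fp in seen:
            continue  # duplicate of an event already processed
        seen.add(fp)
        if is_critical(event):
            kept.append(event)  # critical events bypass sampling
            continue
        if noncritical_count % sample_every == 0:
            kept.append(event)
        noncritical_count += 1
    return kept
```

A production deduplicator would bound the `seen` set with a time window or cache eviction; the unbounded set here keeps the sketch short.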

Multi-stage routing architecture

Intelligent routing sends security-relevant data to Elasticsearch, full data to Elasticsearch storage tiers, and raw data as JSON or Parquet to cloud storage via dedicated cloud integrations, while supporting Elasticsearch-native features – all from a single pipeline.

Schema drift detection

Automated validation prevents schema changes from breaking detection rules, Kibana visualizations, or compliance reports. Proactive alerting flags data quality issues before they reach Elasticsearch.
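
One simple way to think about drift detection is as a comparison between the fields and types seen in an incoming document and a known-good baseline. The sketch below shows that comparison under stated assumptions: the baseline format (dotted field name to Python type name) and both function names are invented for illustration and do not describe DataStream internals.

```python
def flatten_types(doc, prefix=""):
    """Return {dotted_field: type_name} for every leaf in a nested document."""
    out = {}
    for key, value in doc.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            out.update(flatten_types(value, prefix=f"{path}."))
        else:
            out[path] = type(value).__name__
    return out


def detect_drift(baseline, doc):
    """Compare a document's fields and types with a baseline schema."""
    observed = flatten_types(doc)
    return {
        "missing": sorted(set(baseline) - set(observed)),
        "unexpected": sorted(set(observed) - set(baseline)),
        "type_changed": sorted(
            f for f in set(baseline) & set(observed) if baseline[f] != observed[f]
        ),
    }
```

A non-empty result in any of the three lists is the kind of signal that could trigger an alert before the changed data reaches Elasticsearch.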

Elastic Security ready

ECS-compliant output ensures Elastic Security detection rules, alerts, and investigation workflows function optimally – with no manual field extraction or schema mapping required.

ARCHITECTURE

How DataStream works

A multi-stage pipeline that processes raw logs before they reach your Elastic environment: each step improves data quality and reduces infrastructure cost.

Ingest

Logs from any source via Elastic Agents, Beats (Filebeat, etc.), agentless collection (WinRM/SSH), Syslog, CEF, LEEF, HTTP, and direct APIs

Parse & normalize

ECS-aware parsers and field mapping: Windows Events, Syslog, JSON, custom formats. Uses native Elasticsearch Ingest Pipeline format for all transformations

Enrich & validate

Contextual metadata enrichment + schema validation against Elastic Common Schema (ECS) requirements

Filter & optimize

Deduplication, sampling, and field extraction reduce volume by 50–90%

Route & deliver

Multi-stage routing: ECS-normalized security events → Elastic Security, full data → Elasticsearch, raw data → cloud storage (JSON or Parquet)
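
The routing step above can be sketched as a function that decides, per event, which destinations receive it. The destination labels and the set of security-relevant categories are assumptions chosen for this example (the category values themselves are valid ECS `event.category` values); real routing rules would be configured in the product.

```python
# Assumed choice of ECS event.category values treated as security-relevant.
SECURITY_CATEGORIES = {"authentication", "intrusion_detection", "malware", "threat"}


def route(event):
    """Return the list of destinations for one normalized event."""
    # Every event is archived raw and stored in full; names are hypothetical labels.
    destinations = ["cloud_archive", "elasticsearch"]
    categories = event.get("event", {}).get("category", [])
    if SECURITY_CATEGORIES.intersection(categories):
        destinations.append("elastic_security")
    return destinations
```

For example, an event with `event.category: ["authentication"]` is sent to all three destinations, while a routine process event goes only to the archive and the full-data tier.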

API & authentication

Elasticsearch Bulk API with native protocol support
API key and basic authentication with TLS
Elasticsearch Ingest Pipeline execution with error handling
High-throughput batching with retry logic
Rate limiting & throttle prevention built-in
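
For readers unfamiliar with the Bulk API, the sketch below shows the pieces the list above refers to: a newline-delimited JSON body, an API key `Authorization` header, and an exponential backoff schedule for retries. It builds the request parts without sending anything. The index name, the helper names, and the backoff parameters are assumptions for illustration, not DataStream defaults.

```python
import base64
import json


def build_bulk_body(index, docs):
    """Serialize documents as NDJSON for the Elasticsearch _bulk endpoint."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))  # action line
        lines.append(json.dumps(doc))                           # source line
    # The Bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


def api_key_headers(key_id, api_key):
    """Build headers for Elasticsearch API key authentication."""
    token = base64.b64encode(f"{key_id}:{api_key}".encode("utf-8")).decode("ascii")
    return {
        "Authorization": f"ApiKey {token}",
        "Content-Type": "application/x-ndjson",
    }


def backoff_delays(retries, base=0.5, cap=30.0):
    """Exponential backoff schedule in seconds, e.g. after HTTP 429 responses."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]
```

The body and headers would be sent over TLS as a `POST` to the cluster's `_bulk` endpoint; a client that receives a throttling response could sleep for each value in `backoff_delays(...)` before retrying the failed items.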

ECS components

Pre-built ECS field definitions and aliases for common security data sources
Field mapping aligned to ECS data structures
Anomaly detection on schema drift
Compatible with Elastic Security detection rules
Optimized for Kibana visualizations and dashboards

Deployment options

Docker / Kubernetes container deployment
On-premises agent deployment
Cloud-native (AWS, Azure, GCP, Elastic Cloud)
Air-gapped & data-residency configurations
Multi-tenant MSSP configurations

Built for enterprise challenges

Elasticsearch ingest cost reduction

Elasticsearch and Elastic Cloud pricing scales with data volume and retention. DataStream acts as an intelligent optimization layer that reduces ingest volume by 50–90% through ECS-aware filtering, deduplication, and field extraction, without creating security blind spots or breaking existing Kibana dashboards and detection rules.

Legacy & OT integration

Legacy systems, OT networks, IoT devices, and custom applications often lack native Elastic Agents or Beats. DataStream provides ready-made connectors for all widely used sources and flexible transformation pipelines that eliminate custom development and reduce deployment complexity.

Elastic Security activation

Elastic Security’s detection rules and investigation workflows rely on ECS-compliant data to function optimally. DataStream’s automated ECS mapping ensures all ingested data is immediately available for security alerting, threat detection, and investigation without manual field extraction or schema mapping.

MSSP customer onboarding

MSSPs and security integrators can onboard new customers to Elasticsearch and Elastic Cloud faster with reduced integration complexity. Pre-built connectors and automated ECS normalization reduce custom development effort and accelerate time-to-deployment.

ROI & SAVINGS

Measurable impact on your Elastic infrastructure

With DataStream, you can extend coverage to more sources without increasing your Elastic infrastructure costs.

Data ingest volume before indexing

Without DataStream: 100%

With DataStream: 50–90% lower

Field-level optimization removes empty values, null fields, and operational metadata for an immediate 55–60% reduction. Optional event-level filtering and sampling push total reduction to 70–90% – with security-critical events always retained.
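
The sketch below illustrates both halves of that statement: pruning empty and null fields from a single event, and the arithmetic for how a field-level reduction combines with event-level filtering. The sample event and function names are hypothetical, and the combined figure assumes events are of similar size; actual savings depend on the data.

```python
import json

EMPTY = (None, "", {}, [])


def prune(value):
    """Recursively drop None, empty strings, and empty containers."""
    if isinstance(value, dict):
        cleaned = {k: prune(v) for k, v in value.items()}
        return {k: v for k, v in cleaned.items() if v not in EMPTY}
    if isinstance(value, list):
        cleaned = [prune(v) for v in value]
        return [v for v in cleaned if v not in EMPTY]
    return value


def size_reduction(before, after):
    """Fraction of serialized bytes removed."""
    return 1 - len(json.dumps(after)) / len(json.dumps(before))


def combined_reduction(field_level, event_level):
    """Total reduction when event filtering follows field-level pruning."""
    return 1 - (1 - field_level) * (1 - event_level)
```

For instance, a 55% field-level reduction followed by dropping 40% of the remaining events gives `combined_reduction(0.55, 0.40)`, roughly 73% in total, which falls inside the 70–90% range quoted above.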

Deployment effort reduction

Custom integration development: ~3 months

With DataStream: a few days

Pre-built connectors and automated ECS normalization eliminate custom development work and accelerate time-to-value.

Comparison

VirtualMetric vs. Alternatives

How does DataStream compare to other data pipeline solutions for Elastic Stack?

Capability	Cribl Stream	Logstash	Native Beats
Native Elasticsearch integration
Automated ECS normalization	Manual	Plugin-based	Per Beat
Elastic Security ready	Generic	Manual	Beat-dependent
OT / Legacy / IoT connectors	Add-ons required	Custom config	Limited
Multi-stage routing	Basic	Pipeline only
Data ingest cost reduction	Generic
Raw data – cloud storage (Parquet)		Manual
Schema drift detection

“VirtualMetric combines deep technical know-how with clear market focus and sharp execution. The team is ISO27001 and SOC2 certified and perfectly positioned to lead the European market in Security Data Management.”

William Lecat

Partner at Auriga Cyber Ventures

“VirtualMetric DataStream enables us to increase our quality of service by removing a lot of manual processing and providing better options to our customers for log ingestion.”

Maarten Goet

Chief Technology Officer at Wortell

“Through mutual respect, dedication, and a willingness to adapt and innovate, they successfully transformed a looming crisis into an opportunity for growth and innovation.”

Mehmet Susuz

IT Associate Director at Turkcell Communication Services

Frequently asked questions

How can I reduce Elasticsearch ingest costs without losing security visibility?

Elasticsearch pricing scales with data volume stored and indexed. DataStream reduces that volume through a layered approach. By default, field-level optimization removes empty values, null fields, and operational metadata that Elastic Security detection rules never reference, achieving a 55–60% reduction with no security risk. Optional event-level filtering and statistical sampling can push total reduction to 70–90%, with security-critical events always protected. Full raw logs are simultaneously routed to low-cost cloud storage (AWS S3, Azure Blob, Google Cloud Storage) with a Correlation ID, so analysts can retrieve complete records for forensic investigations when needed.

Why do I need a pipeline tool if Elasticsearch already has Ingest Pipelines and Beats?

Elasticsearch Ingest Pipelines and Beats handle ingestion and basic transformations within the Elastic Stack, but they have significant limitations. Beats must be installed on each host, so they can’t cover systems where no agent can be installed; Ingest Pipelines don’t automatically normalize data to ECS across diverse source types; and both send everything to a single Elasticsearch cluster. DataStream operates before data reaches Elasticsearch. It collects from any source via agentless collection (WinRM/SSH) or agents where needed, applies vendor-specific ECS normalization using the native Elasticsearch Ingest Pipeline format, reduces volume before it counts against Elasticsearch storage, and routes each data type to the right destination based on security value.

How does ECS normalization work, and do I need to set it up manually?

Elastic Common Schema (ECS) provides a standardized structure for security events. DataStream handles ECS mapping automatically: when logs arrive from a supported source, the multi-schema processing engine applies vendor-specific field mappings using the native Elasticsearch Ingest Pipeline format, validates the output against ECS schema requirements, and routes the normalized data to Elasticsearch. No manual parser writing or field mapping is required for supported sources.
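
For context on what “native Elasticsearch Ingest Pipeline format” means, the sketch below builds a small pipeline definition using standard ingest processors (`rename`, `convert`, `date`, `set`, `remove`). The vendor field names (`src`, `dst`, `dport`, `ts`) are hypothetical, and this is an illustration of the format rather than a mapping shipped with DataStream.

```python
import json

# Assumed vendor field names on the left; ECS field names as targets.
pipeline = {
    "description": "Illustrative ECS mapping for a hypothetical firewall log",
    "processors": [
        {"rename": {"field": "src", "target_field": "source.ip", "ignore_missing": True}},
        {"rename": {"field": "dst", "target_field": "destination.ip", "ignore_missing": True}},
        {
            "convert": {
                "field": "dport",
                "target_field": "destination.port",
                "type": "integer",
                "ignore_missing": True,
            }
        },
        # Parse an epoch-seconds timestamp into @timestamp.
        {"date": {"field": "ts", "formats": ["UNIX"], "target_field": "@timestamp"}},
        {"set": {"field": "event.category", "value": ["network"]}},
        # Drop the vendor fields that have been copied into ECS fields.
        {"remove": {"field": ["ts", "dport"], "ignore_missing": True}},
    ],
}

print(json.dumps(pipeline, indent=2))
```

The printed JSON is the same shape Elasticsearch accepts when a pipeline is created through its ingest pipeline API, which is why mappings written in this format stay portable across Elastic tooling.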

How do I connect sources that don’t have native Elastic Agents or Beats?

DataStream supports both agentless and agent-based collection. Agentless collection connects directly via WinRM (Windows) or SSH (Linux, macOS, Solaris, AIX) with no software installation. For network devices, OT/ICS systems, and security appliances, it supports Syslog, CEF, LEEF, and REST APIs. Pre-built content packs cover all widely used vendors – Fortinet, Palo Alto, Check Point, CrowdStrike, CyberArk, Zscaler, and more – each activating automatically when logs from that vendor are detected.

How does DataStream handle sensitive data before it reaches Elasticsearch?

DataStream applies policy-based redaction and masking in the pipeline, before data leaves your environment. You define rules through a no-code UI to automatically remove or obfuscate sensitive fields: usernames, passwords, tokens, PII such as email addresses and phone numbers, or any custom field. Redaction is applied consistently across all incoming data, eliminating the risk of uneven coverage from manual processes. The structure and security context of each log remain intact, so Elasticsearch detection rules and correlations continue to work accurately. Policies are designed to support GDPR, HIPAA, and PCI DSS requirements.
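
As a rough picture of policy-based redaction, the sketch below replaces values under sensitive keys and masks email addresses inside free text while leaving the event structure untouched. DataStream exposes this through a no-code UI; the key list, placeholder strings, and the simplified email pattern here are assumptions for illustration only.

```python
import re

# Simplified email pattern; real policies would cover more formats.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Assumed policy: field names whose values are always removed.
SENSITIVE_KEYS = {"password", "token", "api_key"}


def redact(value, key=None):
    """Return a copy of the event with sensitive values masked."""
    if key in SENSITIVE_KEYS:
        return "[REDACTED]"
    if isinstance(value, dict):
        return {k: redact(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(v) for v in value]
    if isinstance(value, str):
        return EMAIL_RE.sub("[EMAIL]", value)
    return value
```

Because only values change and every key is preserved, fields such as `event.action` remain available to detection rules after redaction.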

Can DataStream send data to both Elasticsearch and cloud storage simultaneously?

Yes, multi-destination routing is a core capability. DataStream can simultaneously send ECS-normalized, security-relevant events to Elasticsearch or Elastic Cloud for real-time analytics, full data volume to Elasticsearch storage tiers for long-term retention, and raw logs in JSON or Parquet format to AWS S3, Azure Blob Storage, or Google Cloud Storage for archival at a fraction of Elasticsearch’s storage cost. Each destination receives the appropriate data tier based on security value, with a Correlation ID linking optimized Elasticsearch data back to complete raw logs in archive storage.
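
The Correlation ID link can be pictured as one shared identifier written to both copies of an event. The sketch below is a minimal illustration: the `labels.correlation_id` field name, the archive record layout, and both helper functions are assumptions, not DataStream's actual schema.

```python
import uuid


def split_tiers(raw_line, normalized):
    """Tag the optimized event and the raw record with one shared ID."""
    correlation_id = str(uuid.uuid4())
    optimized = dict(normalized)
    # Hypothetical field name; ECS reserves `labels` for custom key/value pairs.
    optimized["labels"] = {**normalized.get("labels", {}), "correlation_id": correlation_id}
    archived = {"correlation_id": correlation_id, "raw": raw_line}
    return optimized, archived


def find_raw(archive, correlation_id):
    """Look up the complete raw record for an optimized event."""
    for record in archive:
        if record["correlation_id"] == correlation_id:
            return record["raw"]
    return None
```

An analyst who starts from the optimized event in Elasticsearch can take its correlation ID and retrieve the matching raw log from archive storage.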

How long does it take to deploy DataStream for Elasticsearch?

Initial deployment setup takes under 30 minutes. DataStream’s guided configuration automatically handles Elasticsearch authentication, endpoint configuration, and pipeline routing – no manual Elasticsearch cluster reconfiguration required. The actual time-to-value depends on the complexity of your data sources and normalization requirements. Contact our team for a technical consultation on your specific deployment timeline.

Talk to our experts

Schedule a technical session with our engineering team to see how DataStream compares to what you’re running today.

Try DataStream

Route data to your SIEM in the correct schema, with automatic normalization and up to 90% data volume reduction.
