VZ editorial frame
Read this piece through one operating lens: AI does not automate first; it amplifies first. If the underlying decision architecture is clear, AI scales clarity. If it is noisy, AI scales noise and cost.
VZ Lens
Through a VZ lens, this is not content for trend consumption; it is a decision signal. The strongest enterprise stack combines proprietary data with controllable open models. This is where strategic differentiation compounds. The real leverage appears when the insight is translated into explicit operating choices.
TL;DR
An enterprise AI strategy should focus not on building a proprietary foundation model but on a unique combination of open model weights and private data assets. This “proprietary data + open weights” formula, complemented by an internal measurement system and deep integration into workflows, creates a sustainable competitive advantage. A concrete example: a CRM system that uses an open model, fine-tuned on the company’s own sales interactions, to recommend next steps in real time.
Some corporate debates about AI strategy revolve around the wrong question.
The wrong question: Should we build our own foundation model?
For most companies, the answer is clear: no. Developing a foundation model requires billions of dollars in investment, massive compute capacity, and a dedicated team of ML researchers. This is the domain of just a few dozen organizations worldwide.
The real strategic question is: how does an organization combine open models with its own data advantage?
The Formula
The increasingly powerful corporate AI formula consists of four elements:
Open weights (openly licensed model weights) + Private data (the organization’s own data assets) + In-house evaluation (an internal measurement system) + Workflow integration (embedding into the organization’s own processes)
These four elements together—and only together—create a lasting competitive advantage.
Why?
- Without Open weights: vendor lock-in, ongoing API fees, limited customizability
- Without Private data: the model cannot leverage what the organization uniquely knows
- Without In-house evaluation: it is impossible to measure whether the AI is actually better at the specific task
- Without Workflow integration: AI capabilities are not embedded in actual operations, so their value never materializes
Why is this important now?
The maturity of open-weights models
By 2024–2025, open model families such as Llama 3, Mistral, Gemma 3, Qwen2.5, and Phi-4 have reached production-grade quality for most enterprise use cases. The “but closed models are better” excuse applies to an increasingly narrow domain: the most complex, open-ended tasks.
This means that an increasing proportion of companies are reaching a decision point where an open-model-based strategy is a truly viable option—not a forced compromise.
Private Data Assets as an Underutilized Resource
Most companies have accumulated vast, untapped data assets:
- CRM data: customer interactions, successful sales patterns
- Internal documentation: procedures, best practices, decision-making logic
- Process logs: error patterns, exception handling cases, quality control data
- Domain-specific dictionaries and nomenclatures
Precisely because this data is private—because it is derived from the organization’s internal operations—it carries knowledge that a general-purpose model will never contain.
When this data is used to fine-tune an open-source base model, the result is an AI system with unique domain knowledge.
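As a minimal sketch of the first practical step, the snippet below converts hypothetical CRM interaction records into prompt/completion pairs in the JSONL format most open-model fine-tuning toolchains accept. The record fields (`stage`, `notes`, `next_step`) and the prompt template are illustrative assumptions, not a real schema.

```python
import json

def crm_to_training_examples(interactions):
    """Convert raw CRM interaction records into instruction-tuning
    examples (prompt/completion pairs) for fine-tuning an open model.
    Field names are hypothetical, for illustration only."""
    examples = []
    for rec in interactions:
        prompt = (
            f"Customer stage: {rec['stage']}\n"
            f"Interaction notes: {rec['notes']}\n"
            "Suggest the next sales step:"
        )
        examples.append({"prompt": prompt, "completion": rec["next_step"]})
    return examples

def write_jsonl(examples, path):
    # One JSON object per line: the de facto fine-tuning data format.
    with open(path, "w", encoding="utf-8") as f:
        for ex in examples:
            f.write(json.dumps(ex, ensure_ascii=False) + "\n")

interactions = [
    {"stage": "demo scheduled",
     "notes": "Asked about pricing tiers.",
     "next_step": "Send tiered pricing sheet before the demo."},
]
examples = crm_to_training_examples(interactions)
```

The resulting JSONL file would then feed a standard fine-tuning pipeline (for example, a LoRA run against an open base model).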
What has changed from a business perspective?
Two years ago, corporate AI strategy was largely a question of “which API should we subscribe to?” Today, strategic decisions are much more nuanced:
- Which model is best for which task?
- Where does it make sense to use fine-tuning vs. RAG vs. prompt engineering?
- What data should be used for internal fine-tuning?
- How do we measure performance for our own tasks?
These are architectural and strategic issues—not just technological decisions.
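These trade-offs can be made explicit. The toy heuristic below encodes one common rule of thumb, purely as an illustration of how such a decision might be codified: prompt engineering for one-off or low-volume tasks, RAG when the underlying knowledge changes often, fine-tuning when the task is stable and backed by enough labeled examples. The threshold is an assumption, not a benchmark.

```python
def choose_approach(task_is_stable, knowledge_changes_often, labeled_examples):
    """Toy decision heuristic: prompt engineering vs. RAG vs. fine-tuning.
    The 1000-example threshold is an illustrative assumption."""
    if knowledge_changes_often:
        # Fresh, fast-moving knowledge favors retrieval at inference time.
        return "RAG"
    if task_is_stable and labeled_examples >= 1000:
        # A stable task with enough curated examples justifies fine-tuning.
        return "fine-tuning"
    # Default: the cheapest option to try first.
    return "prompt engineering"

choice = choose_approach(task_is_stable=True,
                         knowledge_changes_often=False,
                         labeled_examples=5000)
```

For example, a stable classification task with 5,000 curated examples lands on fine-tuning, while a support bot over a changing policy wiki lands on RAG.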
Where has public discourse gone wrong?
Open models aren’t always better—but they’re different
The “open vs. closed model” debate is often treated as a sharp dichotomy. The reality is more nuanced.
Closed frontier models are generally more capable: OpenAI’s o3, Anthropic’s Claude Opus 4, and Google’s frontier Gemini models lead on general intelligence. If the AI task is open-ended, complex, creative, or one-off in nature, they are the default choice.
Open models should be preferred in cases where:
- the data is sensitive and cannot leave the organization
- the task is well-defined and can be fine-tuned
- the scale is large and the inference cost is a key factor
- the need for customization is high
- the risk of vendor lock-in is unacceptable
This is not an ideological decision. It is a risk- and cost-profile decision.
The difference between a “proprietary model” and a “proprietary AI system”
An important distinction: proprietary model vs. proprietary AI system.
Only a handful of companies should build their own model. Every organization, however, should have its own AI system.
An in-house AI system is not about owning the model. It is about whether the organization:
- understands its own tasks and data assets,
- has built an internal evaluation infrastructure,
- has integrated AI into real-world workflows,
- and has the capacity for continuous iteration.
This system—not the model—is the true competitive advantage.
What deeper pattern is emerging?
Data Assets as a Moat
One of the least understood sources of competitive advantage in AI strategy is internal data assets.
Every organization has data assets that others cannot replicate:
- proprietary customer data and interaction patterns
- proprietary manufacturing and quality control data
- proprietary research and development materials
- proprietary compliance and legal documentation
When this data is used to fine-tune an AI system—specifically an open model that can run on your own infrastructure—the result is unique. Competitors cannot replicate this because they lack the data.
This is the combination of data sovereignty and domain-specific AI—and this is where enterprise AI builds a true, lasting moat.
Integration as a differentiator
The value of an AI system stems not from the model’s performance, but from the depth of integration.
A CRM system where AI suggests next steps in real time based on customer interactions—and where these suggestions come from a model fine-tuned on the company’s own sales data—creates fundamentally different value than a general AI assistant that salespeople occasionally consult.
The depth of integration and internal data assets—these are the sources of lasting competitive advantage.
Why isn’t this an isolated trend?
The emergence of the “open weights + private data” formula is part of a broader enterprise AI maturity cycle.
In the first wave (2022–2023), companies began using AI: ChatGPT, Copilot, general-purpose APIs.
In the second wave (2024–2025), differentiation began: who can build a deeper, more specific, better-integrated AI system?
In the third wave (2025+), data assets and evaluation infrastructure will become the key differentiators—and this is where the “open weights + private data” formula gains traction.
What are the strategic implications of this?
Data inventory as the first step
The first step in applying the formula: data inventory. What internal data assets do we have that could potentially serve as raw material for fine-tuning?
Evaluation criteria:
- Quantity: Is there enough data? (Typically, several thousand to tens of thousands of examples are needed.)
- Quality: Is the data reliable, consistent, and free of bias?
- Relevance: Is it relevant to the desired AI task?
- Sensitivity: Is anonymization or aggregation necessary?
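One lightweight way to run this inventory is to score each candidate dataset against the four criteria. The weights, thresholds, and dataset names below are illustrative assumptions for a sketch, not a validated scoring model.

```python
def score_dataset(n_examples, quality, relevance, needs_anonymization):
    """Score a candidate fine-tuning dataset on the four inventory criteria.
    quality and relevance are subjective 0-1 ratings; the 2000-example
    threshold ("several thousand") is an illustrative assumption."""
    quantity = 1.0 if n_examples >= 2000 else n_examples / 2000
    score = (
        quantity
        + quality
        + relevance
        - (0.5 if needs_anonymization else 0.0)  # anonymization adds cost
    )
    return round(score, 2)

# Hypothetical inventory: rank datasets by score to prioritize pilots.
inventory = {
    "crm_interactions": score_dataset(15000, 0.8, 0.9, True),
    "support_tickets": score_dataset(800, 0.6, 0.7, False),
}
```

Even a crude ranking like this forces the useful conversation: which dataset is worth the first fine-tuning pilot, and which needs curation or anonymization work first.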
Building the evaluation infrastructure
The second key element of the formula: a proprietary evaluation system. Without this, it is impossible to measure whether the fine-tuned model is actually better at the specific task.
Components of the evaluation infrastructure:
- Golden set: manually curated test data containing expected outputs
- Automatic metrics: task-specific metrics (e.g., accuracy, recall, F1)
- Human evaluation: where automatic metrics are insufficient
- Regression tests: ensure that a new iteration does not degrade previous results
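Put together, a minimal version of this infrastructure fits in a few lines: a golden set, an automatic metric, and a regression gate that compares a new model iteration against the previous one. The toy golden set and the dict-backed stand-in models are illustrative; real versions would call actual inference endpoints.

```python
def accuracy(predict, golden_set):
    """Automatic metric over a manually curated golden set: the fraction
    of inputs whose prediction matches the expected output."""
    hits = sum(1 for x, expected in golden_set if predict(x) == expected)
    return hits / len(golden_set)

def regression_gate(new_predict, old_predict, golden_set, tolerance=0.0):
    """Regression test: accept the new iteration only if it does not
    degrade golden-set accuracy beyond the tolerance."""
    new_acc = accuracy(new_predict, golden_set)
    old_acc = accuracy(old_predict, golden_set)
    return new_acc >= old_acc - tolerance, new_acc, old_acc

# Toy golden set and stand-in models (real ones would call inference).
golden = [("2+2", "4"), ("capital of France", "Paris"), ("3*3", "9")]
old = {"2+2": "4", "capital of France": "Paris", "3*3": "6"}.get
new = {"2+2": "4", "capital of France": "Paris", "3*3": "9"}.get

ok, new_acc, old_acc = regression_gate(new, old, golden)
```

The gate is the piece that makes iteration safe: a fine-tuning run that fixes one task but silently breaks another gets caught before deployment.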
Workflow integration as the final—and most important—step
The value of an AI system materializes when it is integrated into the actual workflow.
This isn’t always the most exciting task—integration is often mundane, repetitive engineering work. Yet this is what turns an AI investment into a return on investment.
What should you be watching now?
What can we expect in the next 6–12 months?
RAG + fine-tuning hybrid architectures. Retrieval-augmented generation (RAG) and fine-tuning are not alternatives—they are complementary approaches. In the coming period, companies will achieve the best results with hybrids.
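A minimal sketch of the hybrid pattern: a toy keyword retriever supplies fresh context, and the fine-tuned model, here any stand-in callable, answers from that context. Real systems would use embedding-based retrieval with a vector index and an actual model endpoint; the documents and scoring are assumptions for illustration.

```python
def retrieve(query, documents, k=2):
    """Toy retriever: rank documents by word overlap with the query.
    Production systems would use embeddings and a vector index."""
    q = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def rag_answer(query, documents, generate):
    """Hybrid pattern: retrieval supplies fresh knowledge; the fine-tuned
    model (any callable str -> str) supplies domain-specific behavior."""
    context = "\n".join(retrieve(query, documents))
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    return generate(prompt)

docs = [
    "Refund requests are approved within 14 days of purchase.",
    "Enterprise contracts renew annually in January.",
    "Support tickets are triaged within 4 business hours.",
]
top = retrieve("When are refund requests approved?", docs, k=1)
```

The division of labor is the point: retrieval keeps answers current without retraining, while fine-tuning fixes tone, format, and domain behavior that prompting alone cannot.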
The emergence of the data quality industry. As more and more organizations adopt fine-tuning-based strategies, data curation, anonymization, and augmentation will become an industry segment.
Building internal AI expertise. The shift from deciding “which API to subscribe to” to deciding “how to build an internal AI system” will also be visible at the HR level—demand for roles such as ML engineer, data curator, and AI product manager will grow.
Conclusion
Most companies’ competitive advantage does not lie in who can train a better general-purpose model.
Rather, it lies in who understands their own operational logic best, who possesses the best internal data assets—and who can combine these most effectively with available AI infrastructure.
The new corporate formula for AI: open weights + private data + proprietary evaluation + workflow integration.
This isn’t the easiest path. But it’s the one that translates into a lasting advantage.
Related articles on the blog
- Vertical AI: Why Does a Smaller, Specialized Model Beat a Frontier System?
- LoRA and the commoditization of AI: fine-tuning has become the new weapon
- The corporate advantage of specialized small models: NVIDIA and LoRA
- Why AI Projects Fail — and What Can We Learn From Them?
- The Strategic Map of Global AI Competition
Key Takeaways
- The strategic issue is not model ownership — For most companies, the real question is how to leverage their private data assets with open model weights, not how to build their own base model.
- Sustainable competitive advantage is built on four elements — The combined use of open weights, private data, internal performance measurement, and workflow integration is necessary for AI to create real business value.
- Private data is the most important moat — A company’s unique customer, manufacturing, or documentation data provides domain knowledge to the fine-tuned model that competitors cannot replicate.
- The maturity of open-source models has changed the game — Open-source model families like Llama 3, Mistral, and Qwen2.5 have achieved production-grade quality, offering a viable, customizable alternative to closed APIs.
- The strategy is about your own AI system, not the model — The real value lies in understanding internal tasks, building evaluation infrastructure, and deeply integrating AI, not in the model itself.
Strategic Synthesis
- Translate the core idea of “Own Data + Open Weights: The Enterprise AI Equation” into one concrete operating decision for the next 30 days.
- Define the trust and quality signals you will monitor weekly to validate progress.
- Run a short feedback loop: measure, refine, and re-prioritize based on real outcomes.
Next step
If you want your brand to be represented with context quality and citation strength in AI systems, start with a practical baseline and a priority sequence.