Databricks

Artificial Intelligence·San Francisco, CA·Website

1 following on Crush

Data and AI platform built on the lakehouse architecture.

Getting hired at Databricks

Databricks is the clearest embodiment of the "enterprise AI company" thesis: serious data infrastructure roots, strong open source DNA, and an AI/ML platform built on top of both. It's also one of the most consistently interesting engineering destinations for people who care about data systems, ML infrastructure, or enterprise-scale product engineering. Understanding what makes someone fit well there requires understanding how these different parts of the company actually relate to each other.

The data + AI intersection, practically

Databricks started as a company to commercialize Apache Spark — a way to run Spark without the operational pain of managing it yourself. That framing has evolved substantially. The current positioning is around a "Data Intelligence Platform" that connects data warehousing, ETL, ML training, and model serving under one roof.

The technical work reflects this breadth. There are teams working on the query engine and optimizer (genuinely hard database internals work). Teams working on ML training infrastructure and distributed training frameworks. Teams working on MLflow, which Databricks maintains as an open source project. Teams working on Unity Catalog (their data governance layer). Teams working on Delta Lake, the storage layer that's become a standard in the data lake space. Teams working on the AI assistants and natural language interfaces that sit on top of all of it.

This breadth is both an opportunity and a potential source of confusion for candidates. "Working at Databricks" can mean very different things depending on which team you're on. The interview prep and positioning that works for a query engine role is different from what works for an AI platform role.

Serious ML/data engineering, concretely

If you're a data or ML engineer, Databricks is one of the few companies where the technical challenges are genuinely first-order. You're not building data infrastructure to support an AI product — data infrastructure is the product.

What serious looks like at Databricks: engineers who have worked on distributed execution engines, who understand the physical layer of query planning, who've dealt with the complexity of making distributed ML training fast and reliable at scale. People who've worked on columnar storage formats, or built vectorized execution kernels, or worked on GPU scheduling for training workloads.

This is not a company where you "get into data" from a more general software engineering background. The talent bar is calibrated to people who've spent meaningful time in the domain. What they're less interested in: people who've used Spark or Databricks as a practitioner but haven't worked at the infrastructure layer. The distinction matters in interviews because interviewers can quickly tell which side of that line you're on.

Open source DNA and what it means for culture

Databricks was founded by the creators of Apache Spark, and the open source relationship is fundamental to how they think about the business. MLflow, Delta Lake, Delta Sharing, and other projects are maintained as open source with Databricks engineering deeply involved. This isn't just marketing — it shapes the engineering culture in concrete ways.

Databricks engineers tend to be unusually comfortable with their work being public, scrutinized, and debated in open forums. They write design documents with the expectation that the community will engage critically. They make architectural decisions with an eye toward what will be defensible to a sophisticated technical audience outside the company.

For candidates with open source backgrounds: this is an environment where that experience is genuinely valued and understood. Contributions to major data ecosystem projects — Spark, Flink, Arrow, dbt, Iceberg — read as strong signal. Not just as a credential, but because it suggests familiarity with the design thinking and community dynamics that Databricks operates within.

For candidates without open source backgrounds: not disqualifying, but you should be able to speak to why the mission of building in the open is interesting to you. Generic answers about "giving back to the community" land poorly. Specific answers about the technical benefits of the open source model — better security review, faster ecosystem adoption, honest feedback loops — land much better.

Enterprise sales motion and what that means for product and engineering

Databricks is an enterprise company. Revenue comes from large organizations paying significant amounts for platform access. This has implications that aren't always obvious from the outside.

The product roadmap is heavily influenced by what large enterprise customers need — compliance, security, governance, integrations with existing enterprise tooling, and support for the specific cloud environments (AWS, Azure, GCP) that enterprises have standardized on. Features that are technically interesting but don't move enterprise deals don't get prioritized in the same way.

For engineers who care purely about technical craft: this is worth knowing in advance. The work is technically sophisticated, but the product decisions are enterprise-driven. You will ship features whose primary value is making a specific customer's security team comfortable, not advancing the state of the art.

For engineers who are interested in how technical depth meets commercial reality: Databricks is a good place to learn this. The sales cycle is complex and the engineering team has to understand customer needs at a depth that's uncommon in consumer or developer-focused companies. If you've ever been curious about what "enterprise software" actually means from the inside, and you want to learn it in a technically serious environment, this is one of the better places.

PMs at Databricks need to be comfortable with customer advisory boards, specific enterprise requirements, and the dynamic where a single large customer's needs can significantly influence the roadmap. This is not a product role for people who prefer working from user research and aggregate signals. The "users" are data engineers at Fortune 500 companies, and their voices are loud and specific.

The IPO trajectory

Databricks has been one of the most anticipated private-company IPOs for several years. Valuation has fluctuated with market conditions, but the business is clearly on a public company trajectory. This matters for how people think about compensation and equity.

For people joining at this stage: the upside is real but compressed relative to where it was three or four years ago. The risk profile is lower — this is a company with genuine revenue, not pre-revenue speculation. The equity math makes sense as part of a compensation package at a grown-up company, not as a lottery ticket.

The IPO timeline has remained uncertain enough that candidates should be skeptical of anyone who claims to know when it's happening. What's reasonable to expect: the company will go public eventually, the equity will be liquid eventually, and the time horizon is probably measured in 1-3 years rather than 5+.

Databricks opens senior roles across its data and AI platform regularly, but the specific roles worth your attention depend heavily on which team is relevant to your background. Follow Databricks on Crush and set your preferences — we'll surface the right signal without the noise.

Open roles(20)

Sr. Solutions Architect

On-site

Full-time·yesterday