
The Generative AI Wave of Human-Machine Interaction

Increasing Access to Intelligence.

In the realms of technology and artificial intelligence, we’re witnessing a profound shift in how humans interact with machines. This shift, largely driven by the advent of Generative AI (Gen AI), is shaping a major inflection point in human-machine collaboration. Up until this point, humans have largely needed to adapt to machine/computer interaction in order to reap the benefits of software automation. With Gen AI, we suddenly have the power for machines to evolve away from prior UX paradigms and more closely adhere to how humans interact. We believe many of our traditional software UI constructs may soon disappear, as we begin to see application interfaces that are more dynamic, hyperpersonalized, and completely context aware thanks to multiple sensory interactions. We’re moving towards a true collaboration and co-creation partnership between humans and machines, something we have never seen before.

In this paper, we start by looking back at history. This isn’t the first platform shift, so looking at prior waves to infer underlying trends will help us make better predictions about the future.

We then cover why we think this wave is different, and propose a New Work Anatomy that reflects how Gen AI has the potential to transform the way we work completely.

For those looking for a technical understanding of how these applications are built, we lay out the building blocks of the tech stack for Gen AI applications, including multimodal considerations. It’s important to note that this stack is constantly changing, so the one we describe may change drastically by the time of publication and will continue to evolve rapidly.

We then focus on key opportunities and risks. There are opportunities to disrupt dormant incumbents, to provide builders with tools to supercharge their capabilities, to think about solving the new set of bottlenecks to future-proof the mass adoption of Gen AI, and most importantly, to create new disruptive and more productive user experiences that recognize the step-change that Gen AI provides for UX design paradigms. On the flip side, risks include underestimating incumbents, underestimating the rate at which the foundation models progress, overpromising the capabilities of transformer model architectures, and the inability to extract economic rent from the value being created.

As we wrap things up, we are filled with immense excitement for the future. For the first time in human history, machines have the potential to act as collaborative partners rather than mere subservient tools. This shift will significantly impact the future of human-machine collaboration. We genuinely believe that there’s never been a better time to start a company.

Waves Come in Sets

In oceanography, a “set” of waves refers to a group of waves that travel together in the same direction with similar characteristics, such as wavelength, amplitude, and speed. Sets are commonly observed in the open ocean, where waves are generated by wind, storms, or seismic activity. In a typical set of ocean waves, one wave frequently stands out as larger or more powerful than the others.

Major technological shifts also come in waves. Every 10 to 15 years, a new technological step-change resets industries and generates a proliferation of new jobs and businesses. These shifts allow new players to enter the market and force incumbents to adapt to a new reality. These are also periods in which fundamental standards and protocols are introduced and become bedrock components of tech development and deployment, like HTML, TCP/IP, compression algorithms, and many more.

We focus here on relatively modern technology waves consisting of mainframes, PCs, the Internet, and mobile. Gen AI (where we are now) is likely the largest and most disruptive wave in the last fifty years.

Platform shifts chart: Mainframes to PCs to Web to Smartphones to Gen AI


The pattern is clear: all previous technology waves have progressively increased access to technology and boosted collaboration between humans and machines.

The Mainframe Wave (1950s–1970s): In the early days of the mainframe computer, large (and relatively few) expensive machines were locked in secure facilities and manned by an elite cohort of operators. Output was distributed solely via hard copy or, at best, proprietary networks of “dumb terminals.” Nonetheless, this era began to usher in large-scale data processing for large enterprises. Still, consumption was characterized by relatively few users.

The Personal Computer Wave (1980s–1990s): Desktop and laptop computing dramatically increased access to computing power, taking it from centralized locations managed by specialized personnel and putting it into the hands of individual users and businesses. Still, there was no easily accessible network to allow widespread communication among individual nodes. This wave firmly introduced transformative milestone innovations like the spreadsheet, the word processor, and the graphical user interface, driving the computer to accommodate native human interaction instead of humans accommodating native computer interaction as was the case in the prior wave.

The Internet Wave (1990s–2000s): The web changed everything, marking a paradigm shift toward global interconnectedness and transforming how information is accessed, shared, and consumed. This digital age reshaped our social interactions and gave rise to instant communication, e-commerce, and streaming. Regarding human interaction, the expansion of the internet ushered in massive innovation in interactive, human-centered design. While the internet in its infancy was little more than a near-static broadcast medium, it quickly evolved into a responsive, context-aware platform that mirrored the applications users were familiar with on their PCs. This was made possible via newly introduced interaction-enabling technologies like HTML, CSS, JavaScript, AJAX, and many more, all of which enabled hugely successful new companies.

The Mobile Wave (2000s–2020s): The PC and Internet waves inevitably hinted at a ubiquitous and persistent relationship with machines and access to the web, its applications, and services that the mobile wave brought to a dramatic new level. It made the Internet pervasive in daily life and enabled wholly new categories of applications through developments like constant connectivity and location-based services. It also allowed many people in emerging markets to access a reliable and affordable Internet connection for the first time.

Once again, the demand to accommodate human interaction more creatively, compared to humans accommodating computer interaction, saw a step-function increase (e.g., touch UIs, early NLP, dynamic context setting via GPS, etc.). As in prior waves, we saw enormous innovation and requisite value creation in thousands of startups establishing new standards, protocols, and proprietary platforms, including application stores, developer tools, and social networks. We again moved much closer to machines conforming to human behavior rather than humans continuing to conform to computer requirements.

A New Work Anatomy

Gen AI will very likely be the most transformative wave within the set we’ve described above. While prior waves introduced significant advancements in technology and computing infrastructure innovation to facilitate accessing and sharing information, the Gen AI wave promises to be far more disruptive, providing us with unprecedented access to intelligence.

Until now, businesses have primarily relied on labor specialization — the division of work into distinct tasks, roles, or processes within a production system. More specialization means more rigidity, so businesses have traded flexibility and adaptability for increased efficiency. Gen AI will completely change the way we work through its ability to incorporate contextual data that prior systems have only attempted to, such as voice and audio (NLP), vision (CV), location, movement, and eventually haptics. By pairing our natural intelligence with artificial intelligence, Gen AI can handle an increasing number of human tasks, enhancing our capabilities and eliminating the specialization-versus-flexibility trade-off that has characterized human progress thus far. AI-powered systems will provide specialization, in sync with humans focusing on higher-order tasks like monitoring, assessing, and adapting.

As Gen AI adoption increases, humans will maintain oversight of end-to-end systems, managing and improving them holistically, while AI will plan and execute every detailed task in the most efficient way possible. Humans and machines will collaborate, making this co-evolution deeply transformative. Humans in collaboration with Gen AI-powered systems have the potential to reduce the monotony of daily tasks, the redundancy of steps in a process due to business unit silos, and the vulnerability of a business to economic shifts given the higher degree of adaptability and flexibility. And the effects compound. We are incredibly excited about the direct and indirect impact this will have on society at large.

The Application Stack

To better understand the market and where we see key opportunity areas evolving, we’ll begin by describing a contemporary tech stack for building Gen AI applications.

The diagram below shows many similarities with previous stages of machine learning (ML), but it also shows major differences.

Gen AI application stack diagram

Model Architecture: Traditional ML models like convolutional neural networks (CNNs) and recurrent neural networks (RNNs) are designed for specific tasks like image recognition or natural language processing. In contrast, Gen AI models like large language models (LLMs) and diffusion models are built on transformer architectures and trained on vast amounts of diverse data, allowing them to generate new content across various domains.

Training Approach: Traditional ML models are typically trained on curated datasets for specific tasks using supervised or unsupervised learning techniques. Gen AI models are trained on massive amounts of unstructured data using self-supervised learning approaches.

Inference: During inference, traditional ML models take input data and produce outputs like classifications, predictions, or translations. Gen AI models, on the other hand, can generate entirely new content like text, images, or audio based on prompts or seed data. This, in turn, impacts the types of applications we can now build. Instead of narrowing the focus to recommendation engines or predictive analytics, Gen AI applications enable open-ended content creation, such as writing articles or composing music.

Orchestration: While the orchestration layer in traditional ML focuses on streamlining the ML workflow, the orchestration layer in generative AI systems has additional responsibilities related to managing multiple LLMs, prompt engineering, real-time data integration, continuous learning, and governance.

At the bottom of this stack is the data layer. This layer involves data collection, preparation (cleaning and normalizing), and storage. It’s arguably the most important part of any AI application since proprietary data and/or proprietary ways of leveraging data (unique data taxonomy) can create differentiation. On the other hand, improper or negligent attention to data preparation can introduce bias and/or otherwise impact the rest of the stack: garbage in — garbage out. Different data source types are tokenized, processed through an embedding model, and typically stored in a vector database.
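As a minimal sketch of that pipeline, consider the following Python toy. The `embed` function here is a hash-based stand-in we invented for illustration; a real application would call an actual embedding model, and the in-memory store stands in for a purpose-built vector database:

```python
import hashlib
import math

def embed(text: str, dim: int = 8) -> list[float]:
    # Toy stand-in for a real embedding model: derive a fixed-length
    # vector from a hash of the text, then L2-normalize it.
    digest = hashlib.sha256(text.lower().encode()).digest()
    vec = [b / 255.0 for b in digest[:dim]]
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

class VectorStore:
    # Minimal in-memory vector store: add (id, text) rows and query by
    # cosine similarity (vectors are unit-length, so a plain dot product).
    def __init__(self) -> None:
        self.rows: list[tuple[str, str, list[float]]] = []

    def add(self, doc_id: str, text: str) -> None:
        self.rows.append((doc_id, text, embed(text)))

    def query(self, text: str, k: int = 1) -> list[str]:
        q = embed(text)
        ranked = sorted(
            self.rows,
            key=lambda row: -sum(a * b for a, b in zip(q, row[2])),
        )
        return [doc_id for doc_id, _, _ in ranked[:k]]

store = VectorStore()
store.add("doc-1", "quarterly safety checklist for process engineers")
store.add("doc-2", "supplier tolerance data for machined parts")
print(store.query("quarterly safety checklist for process engineers"))  # → ['doc-1']
```

A hash embedding obviously captures no semantics; the point is only the shape of the flow: raw text in, fixed-length vectors stored, nearest-neighbor lookup out.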

The infrastructure layer comprises pre-trained models, including proprietary model access via APIs and open-source models, some of which can be downloaded and hosted on company servers or private cloud instances (or even, eventually, on-device and/or air-gapped from the cloud altogether). This layer encompasses all computing and hosting “picks and shovels” technologies to facilitate model training and inference workloads. Depending on the stage and scale of the company, different alternatives make more or less sense. Options include hyperscalers (e.g., AWS, GCP), on-demand and guaranteed serverless (e.g., Mosaic, Lambda), or access to bundles such as OpenAI and Azure.

The orchestration layer comprises the glue that pulls everything together. Instead of focusing on a specific task, this component acts as a unifying layer. In response to a user query, an orchestration layer pulls in the right data from the data layer, enriches via context retrieved from other APIs, and submits the aggregate with the right mix of prompts and examples for model inference. Chaining sequential API calls might be beneficial or required for more complex use cases.
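The retrieve-enrich-prompt-chain flow described above can be sketched as follows. `call_llm` and `retrieve_context` are stubs we invented to stand in for a real model endpoint and the data layer, not any specific product’s API:

```python
def call_llm(prompt: str) -> str:
    # Stub standing in for a real model API call; a production system
    # would hit a hosted or self-hosted model endpoint here.
    return f"[model response to: {prompt[:40]}...]"

def retrieve_context(query: str) -> list[str]:
    # Stub for the data layer: would query a vector database and/or
    # enrich via other APIs (CRM, ticketing, sensor feeds, ...).
    return ["Design spec rev 4 allows a 0.1 mm tolerance."]

def orchestrate(user_query: str) -> str:
    # 1. Pull the relevant data from the data layer.
    context = retrieve_context(user_query)
    # 2. Assemble the prompt: instructions + retrieved context + query.
    prompt = (
        "Answer using only the context below.\n"
        "Context:\n" + "\n".join(context)
        + f"\n\nQuestion: {user_query}"
    )
    # 3. First model call: draft an answer.
    draft = call_llm(prompt)
    # 4. Chained second call: ask a model to review the draft.
    return call_llm(f"Check this draft for unsupported claims: {draft}")

print(orchestrate("What tolerance does the spec allow?"))
```

The second call illustrates the chaining point: more complex use cases string several such calls together, each consuming the previous step’s output.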

The testing and evaluation layer typically marks the difference between a demo and a production-ready application. This includes incorporating validation checks to ensure that LLM responses are accurate and appropriate (depending on the application) and that answers are delivered efficiently. Iterative testing of various component orchestrations will improve the reliability of outputs for different tasks. Developers can then use and reuse these modules so they don’t have to create them from scratch whenever a user interacts with an application.
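As a rough illustration of such a validation module, a check-and-retry wrapper might look like the following. The specific checks (empty output, length cap, banned terms) and the stub model are illustrative assumptions, not a prescribed set:

```python
def validate(response: str, banned: tuple[str, ...] = (), max_chars: int = 2000) -> list[str]:
    # Return a list of problems with an LLM response; an empty list
    # means the response passes. Real checks depend on the application.
    issues = []
    if not response.strip():
        issues.append("empty response")
    if len(response) > max_chars:
        issues.append(f"response exceeds {max_chars} characters")
    lowered = response.lower()
    issues += [f"contains banned term: {t}" for t in banned if t.lower() in lowered]
    return issues

def guarded_call(model, prompt: str, retries: int = 2) -> str:
    # Call the model, re-prompting up to `retries` times if validation fails.
    for _ in range(retries + 1):
        response = model(prompt)
        if not validate(response, banned=("lorem ipsum",)):
            return response
        prompt = f"{prompt}\n(Previous answer failed checks; please try again.)"
    raise RuntimeError("no valid response after retries")

# Usage with a stub model that fails once, then succeeds:
calls = {"n": 0}
def stub_model(prompt: str) -> str:
    calls["n"] += 1
    return "" if calls["n"] == 1 else "Tolerance is 0.1 mm per spec rev 4."

print(guarded_call(stub_model, "What tolerance does the spec allow?"))
```

Packaging checks like these as reusable modules is exactly what lets developers avoid rebuilding them for every new application.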

Finally, user-facing applications sit at the top of the stack. This is the interactive layer between humans and machines, and the access point for us to leverage the power of distributed intelligence smoothly and intuitively.

There’s a significant amount of innovation yet to be seen across all stack layers, especially in developing production-ready components within the testing and evaluation (guardrail) layer. Much value will be created across these efforts throughout this wave. We are tracking all of them, but are particularly excited about how we evolve our (human) interaction with next-gen technology beyond the “text field with a blinking cursor” as a way to unleash the full potential of Gen AI. Human interface innovations must evolve well beyond what we have been accustomed to since the earliest mainframe era. In doing so, we believe that significant new value will be introduced, predominantly by disruptive, innovative startups running ahead of risk-averse or “dormant” incumbents.

Key Opportunities

The question of who will be the biggest beneficiary of the Gen AI platform shift is still an open one. While incumbents benefit from data and distribution advantages, there are significant opportunities for startups and smaller enterprises to be competitive. The real answer to this question is “it depends” because market forces differ across industries. Factors like speed and cost to innovate, concentration of power, elasticity of demand for the product or service, minimum accuracy thresholds, interconnectedness, and penetration of digitization can tilt the balance in favor of startups.

We will explore four categories of opportunities we’re actively targeting:

  • Disrupting Dormant Incumbents
  • Empowering the Builders
  • Future-Proofing Gen AI
  • New User Experiences

(1) Disrupting Dormant Incumbents

In the Bay Area, where tech has the highest share of employment compared to all other markets (and well above the US average), it’s easy to forget that most of the benefits of digitization in the last decade were captured by a few large industries. All it takes is a short drive outside Silicon Valley to realize that the majority of industries are still largely driven by legacy software systems, sometimes even pen and paper.

In oil and gas, process engineers complete daily hazard and safety tasks manually, employing traditional “pen and paper” checklist methodologies. Established manufacturing software vendors operate on-prem only (many of them undergoing a slow shift to the cloud), charging for each new product version, which is usually only a bit faster than its predecessor to justify the upgrade. These antiquated systems are used to design, test, and build all physical goods, from the furniture we sit on to complex car assemblies. The problem isn’t just old systems and data silos, but the way the job gets done, involving many specialists who aren’t in sync with one another. There is a significant lack of communication and understanding among different departments and teams, manifesting as technological incompatibility, communication barriers (differing terminology and jargon), operational silos, and a lack of standards across the organization. Mistakes happen often, redundancy is the norm, and progress is constantly delayed.

But it doesn’t have to be this way. With intuitive AI-powered solutions, startups have the opportunity to completely transform the way these industries operate. The incumbents, due to their sunk cost considerations and the fear of refactoring legacy systems, are far less likely to do so rapidly. They will most likely engage in incremental innovation and focus on adding AI features rather than true ground-up AI-powered solutions.

Let’s look at software for hardware as an example. Over the last 50 years, companies such as Siemens, Dassault, and Autodesk have built CAD and PLM empires. However, their ability to remain competitive is being challenged by new technologies such as the cloud, high-performance computing hardware, and Gen AI, paired with changes in demand from greenfield hardware companies such as SpaceX and Anduril, as legacy software simply can’t keep up with their pace and unique requirements. In other words, the old software companies are slowing progress for new hardware companies. Gen AI can change this.

Gen AI-powered software can provide single-user benefits by reducing the time and cost it takes for engineers to design and simulate new lines of products. The larger opportunity, however, lies beyond individual time savings and happens when multi-user functionality and collaboration (human-to-machine and machine-to-machine) are embraced. Today, design engineers have to get in line for their design blueprints to be tested on expensive simulation (physics and functional design based) engines. With Gen AI-powered solutions, not only can the individual designer generate more versions of their design, but they can also simulate and move on faster. In addition, if a Gen AI-powered design engine can communicate directly with a Gen AI-powered simulation engine (and not go through multiple human reviews), design cycle time collapses significantly, and the human touch efforts required to get a design ready for prototyping decrease by an order of magnitude. Furthermore, the ROI of Gen AI-powered design engines can expand further as downstream procurement data is incorporated in design feedback. With this approach, optimal design can be achieved not just from a mechanical standpoint but also from a supply chain standpoint (e.g., understanding how selecting a certain material or tolerance might impact the number of suppliers one can work with and how much they will charge). This approach ensures the incorporation of proper design for manufacturing (DFM) principles from the very first step.

Inspired by our New Work Anatomy, we believe that soon, hardware manufacturers will work with one supercharged (by AI-powered software for hardware) “process guide” engineer instead of ten design and simulation engineers. Startups that seize the opportunity have a great shot at distancing themselves from incumbents, as the latter will have a hard time shifting from their current business model reliant on (many) licenses. AI-powered newcomers won’t have to compete at the per-user license level, as they have the potential to capture larger budgets for hiring personnel.

Challenges remain, and startups should remember that while they may be slow to innovate, software-for-hardware incumbents have a strong hold on the market. Many of their enterprise customers have made significant investments to integrate a product suite into their organizations and to train their staff, so a better product that only adds value as a point-source solution is just not worth it. Founders should take this into consideration when they think about their initial ideal customer profile (ICP) and expand their definition as they build and extend their platforms’ capabilities. To explain this further, while it might be attractive for startups to begin by complementing an incumbent with a point-source solution, that won’t be enough to unlock venture scale. At some point, the startup will have to compete directly to create greater value and win market share.

Another trap for startups is to start too far from their grand vision, a path that requires too many steps (and too many “ifs”) to get to relevancy, reducing the chances of success and impacting how investors underwrite the opportunity. As an example, a company that’s first innovating a design review process as a part of quality management, then creating a more comprehensive QA/QC tool, then finally moving upstream into generative design, might find themselves facing too long a path to get to relevancy. We believe the initial beachhead should clearly advance the startup’s ability to win in design from the get-go, and not be a simple multiplication of independent events that lead to a very low probability of success.

The case for Gen AI to disrupt traditional industry goes beyond hardware design and manufacturing. Its application has the potential to provide both single-player benefits to individual employees and multi-player value in collaboration across functions in sectors as varied as construction, logistics, legal, and insurance. An interesting space to track will be AI agent-powered services competing with human-powered services and traditional software solutions. The potential is to capture a much larger addressable market than incumbents.

Many companies have traditionally outsourced functions such as legal, accounting, and customer support to third-party service providers that are not AI-augmented. These services businesses have traditionally faced scaling challenges, as they must hire more people in order to serve more customers. In the case of AI-augmented service providers, the scaling challenge disappears, as AI can handle work that was previously done by a group of skilled workers.

The same logic holds for functions handled in-house with the support of traditional software products. A startup that provides AI agent-powered services is incredibly compelling due to the potential market size it could capture. Given that these companies have the opportunity not just to augment human workers (as previous software solutions have done) but also to reduce the number of people needed to get the job done, their capture expands beyond an industry’s software spend and into its annual labor spend. As the chart below shows, Palo Alto Networks, a leading provider of cybersecurity solutions, generates about $8 billion in annual revenue. Their customers, however, spend $375 billion on in-house security experts each year. An AI agent-powered cybersecurity solution that doesn’t just simplify or augment a security employee’s workflow, but reduces the total number of security employees by executing part of their tasks, has the opportunity to capture a portion of that headcount expense: a much larger potential addressable market.

Revenue vs annual salary spend chart: Workday, Palo Alto Networks, GitHub, Salesforce

(2) Empowering the Builders

The quest for impeccable accuracy, often seen as paramount in previous waves of AI, has proven a significant hurdle for businesses aiming for full autonomy. This goal demanded substantial human intervention, resulting in businesses with high headcounts and low profit margins. Consequently, early AI startups that showcased rapid growth struggled to evolve into healthy businesses. As a result, AI predominantly benefited Big Tech, which could afford the luxury of investing heavily in computing resources and talent, overcoming AI’s notorious challenges.

The emerging wave of Gen AI applications promises to shift away from these ambitious and expensive requirements. This new era emphasizes the craft of applications that improve with user engagement and that don’t need 100% accuracy to be useful. By accessing models that are pre-trained and building solutions for use cases that are iterative in nature (e.g., new product design), startups no longer have to bear the cost of building models from scratch and training them for perfection with a sophisticated team of data scientists and machine learning engineers. Developers and business professionals can now leverage AI capabilities to build what customers want, identifying appropriate use cases and harnessing the tech to drive business outcomes. Gen AI facilitates the inclusion of a broader workforce building with AI, moving beyond the realm of a few elite specialists.

Software developers represent a much larger talent pool than ML specialists. Providing the former with the capabilities to build AI-first applications is a massive opportunity. While access to pre-trained models removes a large barrier to software innovation, we need a lot more than performant models to put applications into production. Developers need robust infrastructure and tooling built explicitly for continuous production workflows. These tools must navigate and simplify the complexities of running and maintaining ML systems in a way that is familiar to developers. The use of predominant language tools, unified APIs, and code repositories to access these systems will characterize successful companies.

On the other hand, the elements of AI that developers aren’t familiar with (i.e., data pipeline creation and management) should be completely automated, ideally with the support of AI agents purpose-built for these workflows. A clear opportunity exists to provide software developers with infrastructure and tools that make building AI applications similar to what they’re accustomed to. This way, businesses that don’t have the time or resources to set up an ML practice can reap the benefits of Gen AI with just a few in-house software developers. Consequently, the reach of Gen AI and the demand for AI-powered solutions can be met and expanded across the broader economy.

Robotics is another compelling industry for infrastructure abstraction. A similar finite pool of experts spends countless days and weeks programming new robots and then even more time deploying and maintaining them. Tooling is scattered, everything is custom-built, and deep technical expertise is needed to solve every one-off case. In order to meet the growing demand for robots and automation, we need companies to abstract away the complexities into comprehensive building blocks that can be accessed via an API or equivalent. Winners have the opportunity to become the standard operating system for programming robots, leveraged by a much larger pool of general software developers.

Challenges remain, as good models need to be powered by good data. Access to quality (and often proprietary) data to show meaningful ROI to the customer isn’t straightforward when you’re a new infrastructure company. The cold start problem is real. You have no data initially, you need to generate trust with customers to gain access to their most valuable assets, and even if you’re successful, data practices will differ across customers. We believe founders must figure out how to build confidence and demonstrate the value of their solutions without asking for too much from the customer while avoiding the trap of custom builds that don’t scale. To do this, founders might start by significantly narrowing down their ideal customer profile.

Throughout the history of software development, there’s been a move towards higher levels of abstraction, such as high-level programming languages, the move to the cloud, distributed systems, and open-source libraries. We expect this to continue and accelerate, transforming the day-to-day job of a software engineer from executing customer feature requests and running performance optimizations to ensuring that augmented development driven by Gen AI performs as expected. This shift signifies a clear movement towards a New Work Anatomy that’s been in the works for some time, now dramatically accelerated by Gen AI.

(3) Future-Proofing Gen AI

Gen AI will power significant innovation across all business sectors, including but not limited to media, healthcare, and entertainment. The extraordinary capacity to create content, simulate scenarios, and automate complex tasks highlights its transformative potential. At the same time, many questions arise when thinking about scale. There’s an opportunity to create new frameworks that can ensure resiliency and efficacy in an ethical way.

“Hallucinations are a feature, not a bug” is a phrase that captures an important consequence of how transformers work. Hallucinations aren’t bugs that can be quickly fixed, but rather a feature inherent to the current generation processes. Gen AI applications are built on a probabilistic engine, not a deterministic one, as is the case with traditional software. Fine-tuning helps reduce the number of hallucinations in a model’s output, as it puts more weight on the desired outputs identified in the training process and adjusts the model’s internal parameters. Still, it doesn’t mean the probability of hallucinating drops to zero. In some creative domains like writing, brainstorming, and even problem-solving, this is completely acceptable. In others, such as healthcare, law, and education, there is zero tolerance for inaccurate results. Although we generally prefer startups that begin by tackling creative use cases that are iterative in nature (more in the sections to follow), we also find there’s an opportunity for new startups to provide guardrails so that we can better manage model hallucinations.
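That probabilistic engine can be seen in miniature in temperature-scaled softmax sampling, the common mechanism by which LLMs pick the next token. The token names and scores below are made up for illustration:

```python
import math
import random

def softmax(logits: list[float], temperature: float = 1.0) -> list[float]:
    # Convert raw model scores into a probability distribution.
    # Lower temperature sharpens it; higher temperature flattens it.
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

tokens = ["Paris", "London", "Rome"]  # hypothetical next-token candidates
logits = [3.0, 1.0, 0.5]              # hypothetical model scores

random.seed(0)
for t in (0.2, 2.0):
    probs = softmax(logits, temperature=t)
    sample = random.choices(tokens, weights=probs, k=5)
    print(t, [round(p, 3) for p in probs], sample)
```

Even the least likely token always retains nonzero probability, which is why the chance of an off-distribution (hallucinated) output never drops to zero; it can only be made smaller.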

New tools to manage hallucinations must accommodate the needs of the new AI builders. Even large, sophisticated software companies struggle with model testing (see The Application Stack) when building Gen AI applications. Many of their model selections (and combinations of models) come down to guesswork and instinct, as they don’t really know which changes improve their product’s quality or to what extent an improvement manifests. The problem isn’t that AI models and systems lack evaluation methods; in fact, evals have always been part of the core ML workflow. However, these metrics don’t lend themselves well to software developers, who aren’t interested in grasping ML metrics and are more concerned with capturing the business case accurately. Furthermore, traditional out-of-the-box evals are a “one size fits none” solution, as what constitutes a good output is subjective and particular to the specific use case. We believe founders building in this space should think about how to abstract evaluation metrics in a way that is intuitive for the common software developer. They might also consider how to create a scalable evaluation process, as generating unique evaluation datasets for every testing instance is a big lift.
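One way such an abstraction might look, sketched under our own assumptions (the case names, stub model, and pass criteria are invented), is to express each evaluation as a business-level pass/fail check rather than an ML metric:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalCase:
    # One use-case-specific check, phrased in business terms rather than
    # ML metrics: a prompt plus a pass/fail criterion on the output.
    name: str
    prompt: str
    passes: Callable[[str], bool]

def run_evals(model: Callable[[str], str], cases: list[EvalCase]) -> dict[str, bool]:
    # Run every case against the model and report pass/fail per case.
    return {case.name: case.passes(model(case.prompt)) for case in cases}

# Usage with a stub model; a real harness would call the deployed system.
def stub_model(prompt: str) -> str:
    return "Our refund policy allows returns within 30 days."

cases = [
    EvalCase("mentions_refund_window", "What is the refund policy?",
             lambda out: "30 days" in out),
    EvalCase("stays_on_policy", "What is the refund policy?",
             lambda out: "legal advice" not in out.lower()),
]
results = run_evals(stub_model, cases)
print(results, sum(results.values()) / len(results))
```

A developer reading this sees familiar constructs (functions, assertions, pass rates) rather than BLEU scores or perplexity, which is the abstraction the paragraph above argues for.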

As AI permeates every part of our economy, new opportunities that benefit from its unpredictable nature will arise. With AI as a collaborating force, companies can provide hyper-personalized customer experiences, optimally catering to many, narrow market niches. As more responsibilities are handed over to multiple AIs, there will be a growing need to manage and supervise them, to ensure all the content exposed to customers is safe and the generated output is compliant with company guidelines. This need can be extrapolated to many use cases and many industries; the opportunity is massive.

Challenges remain, as we’re still very early in the cycle. It’s not a matter of if but when AI is incorporated into every aspect of our economy; at the same time, being too early to market, before there’s significant demand to manage AI at scale, is as good as being wrong. We’ve learned that even if the future problem of managing AIs at scale is clear to the customer, solving it falls down the priority stack unless they’re actually feeling the pain. We believe deep customer discovery and astute go-to-market tactics that provide customers with the value they need now are of the essence.

(4) New User Experiences

The last fifty years of technological innovation have consistently trended towards creating machines that accommodate human interaction more creatively, rather than forcing humans to adapt to machine requirements. We see this trend accelerating at unprecedented speed with Gen AI, enabling entirely new user experiences that are immersive, personalized, and intuitively aligned with human behavior. We’re currently focusing most of our attention here, as we believe that companies that embrace real-time, adaptive, and ambient interfaces can set the stage for a new era of technological and experiential innovation. It’s not about the transformer models and training them anymore; in fact, it’s not even about inference workloads.

The next generation of industry giants will include those that audaciously innovate and transform user experiences, leveraging the radical potential of the Gen AI platform shift to its fullest. We’ve all heard that “all models will converge.” But what happens when all models are trained on the same vast reservoir — the World Wide Web? We’re already seeing somewhat of a homogenization of model capabilities and outputs as a result of this phenomenon. How might companies create differentiation if foundational data is universally accessible? There are different avenues, from new model architectures to UX improvements using AI assistants. We believe a “thick” UX layer, composed of proprietary interaction data with advancements beyond Reinforcement Learning from Human Feedback (RLHF) and a series of API calls, will create dynamic product experiences tuned to every individual’s needs and preferences.

Another source of differentiation will come from new data, which should have a significant impact on the related development of hardware and other sensor platforms. To capture more real-time, contextually relevant information, we will need to evolve from the traditional device and sensor form factors we are accustomed to. Smartphones, while smart and loaded with sensors, are not suitable for ambient, always-on, real-world data capture and use in their current form factor. The many experiments we are seeing in AR/VR devices and AI pins are evidence of this fact.

The smart ChatGPT bot interface that blew our minds in 2022 is a perfectly rational way to interact with AI in a user-friendly conversational format. However, when we think about knowledge workers and business use cases for AI, the chatbot interface overlooks too many greenfield user experience opportunities.

Being a knowledge worker in the current economy has become increasingly dynamic. On one hand, individuals are still required to complete a host of manual tasks during the day: wrangling emails, determining order quotes, and shepherding data in and out of databases. At the same time, distractions dominate, and staying focused is always challenging. Knowledge workers attempt to navigate an exponentially growing sea of content, yet human cognition isn’t equipped to keep up. We slog forward, using legacy, familiar workflows that are inadequate, despite the fact that the world around us has changed due to AI.

What makes software valuable is the potential to free up human time. It allows us to reallocate that precious resource to scale our human output and do more than we would be able to otherwise in the same amount of time. The more things we can get done without having to think about doing them, the more we progress as a society. We try to portray this “generation of time” as a result of human-machine interaction and evolution below, where the gap between original human effort and enhanced cooperative effort will continue to increase, with a commensurate impact on process economics.

Human/AI task duration diagram showing time reduction from human alone to new process duration

So, what does the future of AI software look like? Integrated AI agent networks will complete tasks for us without being explicitly programmed to do so every time, and visual (and other) ambient interfaces will help to determine how and when to adapt to ensure we stay on task. Our software will understand us and won’t necessarily require that we learn how to use it. There will be no need to remember every keyboard shortcut, perhaps not even what software to use. Data entry fields, menus, and many of our traditional software UI constructs may disappear. We will begin to see application interfaces that are more dynamic in nature, far from the semi-static, highly structured interfaces we’ve known for fifty years. Some will perhaps have no visual aspect, as we continue to innovate in leveraging audio, voice, hyper-spectral / hyper-audible, and even haptic sensory interactions.

Joanna Maciejewska: “I want AI to do my laundry and dishes so that I can do art and writing, not for AI to do my art and writing so that I can do my laundry and dishes.”

This is what is different with Gen AI. In prior technology waves, software focused on standard workflows to achieve scale, and as humans, we had to adapt our work to those standards in order to reap the benefits of automation. With AI software, we no longer have to adapt to a universal way of communicating with technology in service of a specific task. Instead, this wave of AI-first software with hyper-personalized experiences makes non-standard workflow and process automation scalable. We are actively evolving from humans conforming to machine requirements (e.g. input, output, interaction mechanisms, and patterns) to machines conforming to human requirements (e.g. “ambient” computing, new device forms, multi-modal ingestion, and expression).

Breakthroughs in AI and IoT / edge computing, paired with agents that can read our screens for full context and comprehend multimodal data inputs, are the essential characteristics for the evolution of a new software paradigm that deeply (and ambiently) understands human intent. This new generation of tools and systems will have the power to create personalized experiences with adaptive interfaces that show us what’s useful, hide what’s irrelevant, and automate as much as possible. As humans, we will dedicate more of our time to being orchestrators and strategists (our New Work Anatomy), having more time and freedom to focus on what we’re (still) better at.

Challenges remain. The biggest enemy may be ourselves, as adopting these new products requires a behavioral change that leaves AI skepticism behind. Despite potential productivity gains, old habits die hard, and people become attached to their workflows and supporting tools. Though extremely beneficial in theory, the business ROI case still needs to be proven at scale given how early we are in the cycle. We seek founders who think beyond conventional boundaries and are not afraid to be prescriptive about the future. Strong opinions loosely held are most important when leading the charge toward a new paradigm in human-machine interaction and collaboration.

Key Risks

In the preceding section, we explored the many opportunities presented by Gen AI, highlighting the transformative potential across every industry. All opportunities have associated risks, and generally speaking, we embrace them. However, when we think about risk-adjusted returns, certain areas and conditions should be avoided.

The risk factors that challenge the defensibility and sustainability of companies in this space are broad. They range from over-reliance on third-party platforms, to challenges in differentiating products in crowded markets, to dependencies on regulatory environments that are still in flux, and to the profound implications of ethical missteps.

Thin Veneer Applications

Scooby-Doo "ChatGPT Wrapper" meme

For many application ideas, merely adding a superficial layer of functionality or customization atop third-party foundation models can expose startups to several vulnerabilities.

Companies where dependence on a third-party model is critical to their offering will find that value doesn’t accrue to them but to the underlying model. This inability to extract economic rent from the value created is problematic, as the startup will likely have difficulty capturing a meaningful margin that justifies a venture investment. Despite being pre-seed investors, we always have the end state in mind.

Many examples showcase why over-reliance on third-party models or services can be dangerous, placing startups at the mercy of external changes such as updates or discontinued services. We’ve seen firsthand how adaptations in regulation, market shifts (and strategic inflexibility at the company level), and cost increases can suddenly limit or outright negate a startup’s growth trajectory. We seek founders building with a model-agnostic architecture, so as new models come out, they can quickly shift their core engine to the best-performing one for their particular use case. When it comes to applications, we seek teams that are rigorous about building a thick veneer, via a differentiated UX and/or unique data.
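One way to picture a model-agnostic architecture is a thin interface between the application and any concrete model provider. The provider classes below are hypothetical stand-ins, not real APIs; the sketch only shows that when the application depends on the interface rather than a vendor, swapping the core engine is a one-line change.

```python
# Sketch of a model-agnostic core engine. ProviderA/ProviderB are
# hypothetical stand-ins for competing foundation-model APIs.
from typing import Protocol

class TextModel(Protocol):
    def complete(self, prompt: str) -> str: ...

class ProviderA:
    """Stand-in for one foundation-model API."""
    def complete(self, prompt: str) -> str:
        return f"[A] {prompt}"

class ProviderB:
    """Stand-in for a competitor that may win on cost or latency."""
    def complete(self, prompt: str) -> str:
        return f"[B] {prompt}"

class AppEngine:
    """Application logic never references a concrete provider."""
    def __init__(self, model: TextModel):
        self.model = model

    def answer(self, question: str) -> str:
        return self.model.complete(question)

# Switching providers touches only the constructor argument.
out_a = AppEngine(ProviderA()).answer("hello")
out_b = AppEngine(ProviderB()).answer("hello")
```

The differentiated UX and data live in `AppEngine` and above; the model underneath stays interchangeable.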

A quick side note: sticking to one model (arguably the easiest to work with) when prototyping ideas makes sense. However, as that idea becomes grounded and expands, cost and latency considerations that go beyond performance and ease of use must be accounted for.

When we look “under the hood” of any given application, there are many other factors to consider. One methodology we’re particularly wary of is recursive training, where AI-generated content is increasingly used as training data for future iterations. While efficient, this isn’t always effective, particularly as the human in the loop is increasingly removed. Recursive training can lead to the propagation of errors and biases, as inaccuracies in AI-generated data can be reinforced and amplified in subsequent iterations. Additionally, the diversity and quality of training data may degrade over time, resulting in models that are less robust and more prone to overfitting on flawed data patterns. This process could ultimately diminish the model’s ability to generalize effectively to new, unseen data, undermining its reliability and performance.
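The degradation risk of recursive training can be illustrated with a toy statistical experiment (not a real training pipeline): repeatedly fit a distribution to samples drawn from the previous generation’s fitted distribution. With small samples standing in for imperfect data curation, the estimated variance collapses over generations, a simple analogue of the loss of diversity described above.

```python
# Toy illustration of recursive-training degradation: each "generation"
# trains (fits a Gaussian) only on data produced by the previous
# generation's model. Estimation noise compounds, and the fitted
# variance collapses toward zero -- diversity in the data is lost.
import random
import statistics

random.seed(0)
mean, stdev = 0.0, 1.0          # generation 0: the real data distribution
variances = [stdev ** 2]

for generation in range(100):
    # Small sample of synthetic data from the previous model.
    samples = [random.gauss(mean, stdev) for _ in range(10)]
    # "Retrain" by refitting on that synthetic sample alone.
    mean = statistics.fmean(samples)
    stdev = statistics.pstdev(samples)
    variances.append(stdev ** 2)

collapsed = variances[-1] < variances[0]  # variance has shrunk
```

The same compounding effect is what makes keeping real, human-generated data in the loop so valuable.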

The Steamrolled

Each version of GPT has been trained to improve on specific capabilities compared to its predecessors. GPT-3’s biggest improvement was in question answering, allowing it to provide more accurate and coherent responses to user queries. GPT-4’s standout improvement was its ability to understand and process different — multi-modal — data types beyond text and synthesize them into structured formats. While GPT-5 has not been released at the time of writing, the expected focus is on improved reasoning and agentic capabilities, which would allow it to work through more complex, multi-step problems and exhibit more human-like decision-making and behavior.

In a recent interview, Sam Altman clearly articulated why startups developing niche applications on top of a specific model version are at risk of being “steamrolled” by OpenAI’s rapid model advancements.

Startups that assume the underlying model will not significantly improve will quickly become obsolete. Roadmaps without a clear differentiation strategy from the model provider’s advancements risk becoming inferior or redundant. We look for entrepreneurs with a clear logic as to why their startups will remain competitive, and who are excited to leverage the increasing capabilities of the underlying foundation models. In fact, we often ask early founders if they are excited about successive releases of GPT (e.g. GPT-5). If so, that usually signals a more defensible concept that more powerful GPT releases accelerate rather than threaten. Those who are more concerned, or don’t have a clear answer, are likely to be threatened by a new model release that expands to absorb their functionality.

Underestimating the Incumbents

While the “innovator’s dilemma” articulates why incumbents can do everything “right” yet still lose their market dominance as new competitors rise, the AI platform shift has put this thesis into question.

Unlike incumbents in many other industries, Big Tech incumbents are savvy. They have skilled workforces and systems that allow for rapid integration of new technologies. They also have advantages in distribution, computing, and data, whether because of existing capabilities or ample cash to build them. In realms where innovation is incremental, the advantage tilts in their direction.

As a new platform develops, startups shouldn’t discount an incumbent’s brand reputation, their deep industry insights and relationships, and regulatory or compliance experience. While we’ve seen some incumbent players being more open than normal to experiment with new technology, many of these initial efforts quickly fall flat. Companies need and want an AI strategy, but many don’t yet have a standard process to evaluate new vendors. They involve research, compliance, and IT, and oftentimes a Gen AI committee before making a decision. Startups shouldn’t overlook the challenges of gaining customer adoption, and they shouldn’t discount the trust built with established vendors even if their new product performs better.

We seek founders who acknowledge the strength of the incumbents. They understand what makes them strong and, equally, what makes them weak. We look for founders who can articulate this clearly and explain why their creative go-to-market takes advantage of Dormant Incumbents. Ideally, there’s strong differentiation at the UX level.

Mission Critical Tasks

While supporting customers with mission-critical tasks often represents the highest potential return on investment for AI companies in theory, such engagements are very unlikely to deliver quick wins in practice.

Generative models are inherently probabilistic and stochastic. They generate outputs based on learned distributions and patterns in the input data rather than deterministic rules. Achieving near-complete accuracy with generative models is challenging due to the inherent uncertainty in the generated outputs. While these imperfections might be acceptable and desirable in certain contexts, Gen AI presents serious limitations in use cases where deviations from the ground truth could lead to negative consequences.
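A minimal sketch makes the probabilistic point tangible: a generative model samples each token from a learned distribution, and the sampling temperature controls how much probability mass flows to less-likely tokens. The logit values below are purely illustrative, not from any real model.

```python
# Why generative output is inherently probabilistic: tokens are sampled
# from a distribution over the vocabulary, so even the most likely token
# is never certain. Logits here are illustrative, not from a real model.
import math

def softmax(logits, temperature=1.0):
    """Convert raw scores into a sampling distribution."""
    scaled = [score / temperature for score in logits]
    peak = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [4.0, 2.0, 1.0]  # hypothetical scores for three candidate tokens

p_low = softmax(logits, temperature=0.5)   # sharper: near-deterministic
p_high = softmax(logits, temperature=2.0)  # flatter: more "creative"

# In both cases some probability remains on less-likely continuations,
# which is exactly where deviations from ground truth can enter.
```

Lowering the temperature sharpens the distribution but never makes it a lookup table, which is why near-complete accuracy is so hard to guarantee.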

Trust is most important when a customer is evaluating vendors to support their mission-critical tasks. Startups must demonstrate reliability, security, and robustness, which can easily translate into long sales cycles and expensive regulatory compliance certifications. The go-to-market motion requires careful navigation to overcome these hurdles.

The significant time and capital investments needed to launch these solutions into the market should not be underestimated. As companies scale, ongoing expenses such as acquiring high-quality data and procuring high-performance hardware will persist.

Instead of diving into mission-critical use cases from the start, we seek founders who prefer to initially focus on applications that don’t require near-perfect output accuracy; components where they can gain experience, build a track record, and gradually expand their capabilities. This allows them to mature their AI solutions and remain nimble as more sophisticated hybrid models come online.

Parting Thoughts

In every significant technological evolution in history, going back to the wheel, adaptations in the relationship between humans and machines (both physical and logical) have been a key socio-economic element of the transition. Those adaptations have historically spanned the micro (how humans actually interact with those machines) as well as the macro (societal and economic impact at large). Further, incumbents have more often than not been temporarily, or permanently, disrupted, in consequence often generating new economic value and new key players in those ecosystems.

The wave of innovation we are currently experiencing is arguably the most impactful (and likely the most accelerated) technological wave in modern history. In terms of ultimate impact, it is certainly on par with the Industrial Revolution and it represents a human-machine co-evolution like we’ve never experienced before.

In 2018 we coined a phrase at Bee Partners that has served as a touchstone ever since, a reminder of how profound this era is: “Machines Will Win.” Frankly, they already have won in many aspects of human perception and performance. Our collective ability to make the right investment decisions during this unprecedented period will set the new standards and protocols for HMI, just as the prior waves we discussed introduced new standards and protocols — many represented by new market entrants that accrued enormous value by innovating rapidly while incumbents “waited to see”.

We’ve discussed here this New Work Anatomy that not only embraces but necessitates the increased co-evolution and cooperation with machines that is inevitable given the advancements and acceleration of Gen AI and the underlying model landscape. New companies and incumbents who embrace this new reality will have the best chance of thriving. Those who cling to legacy tech, processes, and standards may be tactically “safe” but strategically at risk.

We have outlined four broad Key Opportunity areas that we feel offer the best odds for inordinate value creation given the current dramatic backdrop of change. In particular, we have emphasized what we feel is the least acknowledged but potentially the most disruptive opportunity domain: the ultimate next-generation human-machine interfaces into Gen AI.

We have similarly outlined four Key Risk areas that we have become attuned to as early-stage investors. This is partly due to our own comprehensive research but mostly due to observing the extensive AI deal flow we’ve seen over the past few years.

This combination of our own internal theses around Key Opportunities and Key Risks, with a sincere hat-tip to the lessons of prior historical tech evolutions, guides our investment decisions here. The final takeaway for readers should be that the way we respond to the profound impact of Gen AI — where machines for the first time in human history represent collaborative vs. subservient entities — will shape human-machine interaction for decades to come. We firmly believe that the opportunities for positive societal and industrial evolution have never been greater.

Appendix: A Note on Founder Archetypes

Building successful Gen AI solutions requires a combination of technical prowess, business acumen, and strategic vision. The ideal founding team for a Gen AI startup typically encompasses a diverse mix of skills and backgrounds to tackle the multifaceted challenges presented by AI technology.

What We Don’t Want

Founders chasing “the current thing” without a vision of the future. While a clear vision is crucial, excessive rigidity can be a downfall. The AI field is rapidly evolving, and new developments might render previous approaches obsolete. Founders must be flexible enough to pivot and adapt their strategies in response to new information or changing market conditions.

Founders who are big thinkers and dreamers but don’t have a solid track record of speed to deployment. Shipping velocity matters more than ever.

Founders who aren’t mindful when it comes to application selection. Not all tasks can tolerate model hallucinations, and opting out of LLMs in those cases is acceptable.

While a founder doesn’t need to be the lead AI scientist, a basic understanding of AI technologies, particularly Gen AI, is crucial. A lack of technical grasp can lead to unrealistic goals, poor decision-making, and difficulty hiring the right talent or communicating effectively with technical teams.

AI startups face significant risks, from data privacy concerns to technology malfunctions. Founders who fail to proactively identify, assess, and mitigate these risks can find their companies facing crises that could have been avoided or significantly softened.

What We Look For

Vision for the future: strong beliefs loosely held. Founders who are prescriptive yet adaptable with their solutions.

Founders who are technical yet understand the value of a good growth and distribution engine.

Founders who have the ability to pivot quickly when necessary, learn from setbacks, and persist through the startup’s inevitable ups and downs.

Those with a commitment to developing and using AI in ways that are ethical and beneficial to society, recognizing the broader impact of their technology.

Startups often require cross-disciplinary collaboration, so team members must be able to work effectively across different fields and perspectives.

Clear, effective communication with internal teams, investors, regulatory bodies, and customers is crucial for aligning goals and driving adoption.