In the previous post, I discussed AI’s role in new mobility and identified the three roles AI can play: enabler, differentiator, or monetizer. The recent explosion of interest in generative AI raises the question: could generative AI contribute to new mobility, and if so, in which of the three roles? This post attempts to answer this question by presenting a few ideas and identifying problems that may inhibit the broad use of generative AI in new mobility.
Over the past few weeks, several of our firm’s corporate partners have asked about generative AI and its implications for their business operations. Several of them have started experimenting with ChatGPT. Generative AI is the part of AI that uses algorithms to create new content. Such content may be in the form of text, images, voice, video, or even computer code. As a field, it was created about sixty years ago, and systems with generative behavior have been presented periodically since then. IBM’s Watson system, introduced around 2011, was one such system. But it wasn’t until generative adversarial networks and transformers were introduced, and more recently Foundation Models and Large Language Models (LLMs), that interest in the field was rekindled. Because of the way they are structured and how they use their training data, generative AI systems have two properties: they exhibit “emergent” behavior that can be considered novel, and they can generalize across domains. We are in the very early stages of understanding the impact of these behaviors in enterprise and consumer applications.
In the previous post, I stated that in new mobility AI has a role as an enabler, a differentiator, or a monetizer. New mobility is about the intelligent movement of people and goods. It emphasizes the use of multimodal transportation as it seeks to achieve safe, affordable, convenient, and environment-friendly transportation for a population. New mobility has four basic components: smart vehicles that may be automated or autonomous, intelligent digital platforms that are used in the provision of mobility-related services, intelligent transportation infrastructures, and novel business models. The question is whether generative AI will contribute to all three roles.
A business colleague told me about his experimentation with ChatGPT and CoPilot. Using a series of unambiguous queries (the quality of the queries directly correlates with the quality of the responses generative AI systems produce), he collaborated with these systems on a customer segmentation problem. The interaction resulted in software snippets which were eventually organized into a running program, complete with documentation. By my colleague’s admission, the interactive investigation and analysis that resulted from the human-machine collaboration saved him significant time, enabled him to consider alternatives he wouldn’t otherwise have evaluated, and allowed him to solve the problem on his own instead of engaging a team. At a time when the automotive and transportation industries need software and AI engineers, this experiment demonstrated that generative AI tools such as ChatGPT and CoPilot can provide meaningful assistance for expediting the implementation, adoption, and monetization of new mobility, as well as for training, or retraining, software engineers.
Eager to explore whether ChatGPT can be used in some aspect of monetizing new mobility, a few days later I decided to use it to plan a familiar journey. I created the following query: On Tuesday I want to travel from Palo Alto to the Kabuki Hotel in San Francisco. I can leave by 7:30 am and need to arrive no later than 8:30 am. I can use both public transportation and ride-hailing but cannot spend more than $10 to get to my destination. What are my options? An automaker may want to offer such a service for a fee in response to detecting heavy congestion on a city’s freeways. Subscription-based journey planners such as Transit or Citymapper can easily address such problems. A good journey planner will immediately show that this problem is unsolvable unless I relax either the time or the budget constraint. Through a series of queries, ChatGPT ultimately provided me with three options. They were all invalid based on the timing constraints I had established, but the modality combinations they used were correct and consistent with the options generated by the Transit and Citymapper applications. As part of the options it generated, ChatGPT also brought up the metric of total time spent in the public transportation system compared to the time spent in the ride-hailing vehicle. That is something I had not considered as a criterion for selecting among options. In other words, the trip’s total price, the overall trip duration, and the time spent in each modality should all be considered when selecting the best of the available options.
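To make the infeasibility concrete, here is a minimal sketch, with invented trip options, durations, and fares, of the constraint check a good journey planner performs before declaring a problem unsolvable:

```python
# Hypothetical sketch: check whether any multimodal trip option satisfies
# both the travel-time window and the budget. The options, durations, and
# fares below are invented; a real planner would pull live schedules.
from dataclasses import dataclass

@dataclass
class TripOption:
    modes: list          # e.g. ["Caltrain", "Muni bus"]
    duration_min: int    # door-to-door travel time in minutes
    cost_usd: float      # total fare

def feasible(options, window_min=60, budget_usd=10.0):
    """Return only the options that fit the time window and the budget."""
    return [o for o in options
            if o.duration_min <= window_min and o.cost_usd <= budget_usd]

options = [
    TripOption(["Caltrain", "Muni bus"], duration_min=75, cost_usd=9.50),
    TripOption(["ride-hail"], duration_min=45, cost_usd=38.00),
    TripOption(["Caltrain", "ride-hail"], duration_min=65, cost_usd=22.00),
]

# With a 60-minute window and a $10 budget, no option qualifies: the
# constraints must be relaxed, exactly what a good planner would report.
print(feasible(options))  # []
```

Relaxing the time window to 80 minutes makes the first option feasible, which is the kind of constraint relaxation a planner should proactively suggest.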
These admittedly simple experiments brought up five issues that we must address if generative AI is to play any role in new mobility.
First, accounting for constraints and dealing with stale data. ChatGPT didn’t understand that the time and cost constraints I initially placed could not be satisfied given the public transportation schedules, the duration of the trips, and the cost of each trip segment it was proposing. It was also using outdated public transportation schedules. In some cases, data freshness may be addressed with model retraining. It is not yet clear how frequently Foundation Models, including Large Language Models (LLMs), should be retrained to incorporate updated information, or what should trigger their retraining. Given the size of these models and the resources required to train them, each retraining operation is expensive. Another approach is to provide such systems with real-time data to be used during inference, e.g., real-time data from public transportation systems and traffic infrastructure sensors. We will need to determine how much this type of access will impact the LLM’s responses and under what conditions access to such data will necessitate model retraining.
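The inference-time alternative can be sketched as simple prompt augmentation. The schedule facts and prompt wording below are invented for illustration; a real system would pull them from a live feed such as GTFS-Realtime:

```python
# Minimal sketch of feeding real-time data to an LLM at inference time
# instead of retraining it. Facts and wording are invented examples.

def build_prompt(question: str, realtime_facts: list) -> str:
    """Prepend fresh, structured facts so the model reasons over current
    data rather than whatever schedules were in its training set."""
    context = "\n".join("- " + fact for fact in realtime_facts)
    return (
        "Use ONLY the real-time transit data below when planning.\n"
        "Real-time data:\n" + context + "\n\n"
        "Question: " + question
    )

facts = [
    "Caltrain 7:12 from Palo Alto is running 8 minutes late",
    "38R Geary bus: next departures 7:55, 8:03, 8:11",
]
prompt = build_prompt("Plan my trip to the Kabuki Hotel by 8:30 am.", facts)
print(prompt)
```

The open question the paragraph raises remains: how strongly such injected data steers the model’s answer, and when it stops being a substitute for retraining.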
Second, checking the validity of the returned results. Factual correctness has been recognized as a problem associated with Foundation Models. In my small experiment, I was able to check the validity of ChatGPT’s options by using my personal knowledge and the Transit and Citymapper applications. My business colleague used his personal knowledge to validate the system-generated code. In many cases, this may be impossible or too expensive.
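Where trusted reference data exists, part of the validity check can be automated. The following sketch, with an invented schedule table and times expressed as minutes after midnight, cross-checks each leg of a proposed itinerary:

```python
# Hypothetical validator: cross-check each leg of an LLM-proposed itinerary
# against a trusted schedule table. All entries are invented for
# illustration; times are minutes after midnight.

schedule = {  # (service, departure) -> arrival, from a trusted source
    ("Caltrain PA->SF", 432): 492,   # 7:12 -> 8:12
    ("Muni 38R", 495): 510,          # 8:15 -> 8:30
}

def valid_itinerary(legs, deadline):
    """A leg is valid if it exists in the schedule and the legs chain:
    each departs after the previous arrival; final arrival <= deadline."""
    clock = 0
    for service, dep in legs:
        if (service, dep) not in schedule or dep < clock:
            return False
        clock = schedule[(service, dep)]
    return clock <= deadline

good = [("Caltrain PA->SF", 432), ("Muni 38R", 495)]
bad  = [("Caltrain PA->SF", 432), ("Muni 38R", 480)]  # no such departure
print(valid_itinerary(good, deadline=510), valid_itinerary(bad, deadline=510))
# True False
```

This only works when an authoritative source to check against exists; for generated code or open-ended analysis, as the paragraph notes, such a check may be impossible or too expensive.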
Third, generative AI systems becoming confused. In a few instances during these two experiments, ChatGPT became confused about what the user was asking it to do. Could ChatGPT do better if it had access to a domain-specific model or even a knowledge base such as Wolfram Alpha? How to integrate such a model or knowledge base into an LLM without breaking the LLM’s functionality and creating other types of erroneous responses is something that still needs to be researched.
Fourth, incorporating a model for the cost of erroneous decisions. Today’s generative AI systems don’t have such a model. It is one thing to recommend a multimodal trip that cannot be accomplished and another to make a wrong medical diagnosis or to generate a process optimization with prohibitive implementation costs. Incorporating such cost models (which will also need to take into consideration the liability involved) will require a clear understanding of how generative AI systems arrive at a decision. Today we don’t quite understand this process. The enormity of Foundation Models and the opaqueness associated with neural network-based systems make this a particularly hard issue to address.
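Even a crude cost model illustrates why this matters. The error probabilities and dollar figures below are invented, and estimating them for an opaque Foundation Model is itself an open problem:

```python
# Sketch of weighting a system's recommendations by the cost of being
# wrong. All probabilities and costs are invented for illustration.

def expected_cost(p_error, cost_of_error, base_cost=0.0):
    """Expected cost of acting on a recommendation."""
    return base_cost + p_error * cost_of_error

# An infeasible trip suggestion is cheap to get wrong; a bad process
# optimization is not, even when the model is more likely to be right.
trip = expected_cost(p_error=0.25, cost_of_error=24.0)       # missed meeting
factory = expected_cost(p_error=0.05, cost_of_error=500000)  # bad retooling
print(trip, factory)
```

The arithmetic is trivial; the hard part, as the paragraph argues, is obtaining defensible error probabilities from systems whose decision process we don’t yet understand.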
Fifth, incorporating context about individuals. To function correctly as a new mobility monetizer, a generative AI system will need to have data about the traveler. Such data will need to include both a description of the traveler’s current situation (what I provided in my journey-planning query) and previously captured data that enables the understanding of the traveler’s preferences and other characteristics. Perhaps LLMs can be used to summarize previously captured customer data and use such summaries in combination with the description of a new situation, but this is still an untested hypothesis. The application of generative AI in medical diagnosis will have the same requirement: understanding a patient’s history, in addition to a description of the problem to be solved, will be necessary. Accessing such data will bring up privacy and other issues, including issues that impact the functioning of such systems.
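One way to test that hypothesis is to condense trip history into a compact profile that accompanies each new request. The record fields and summary format below are assumptions, not any real system’s schema:

```python
# Hypothetical sketch: reduce previously captured trip records to a short
# preference summary that could be attached to a new journey request.
from collections import Counter

def summarize_history(trips):
    """Condense raw trip records into a one-line preference summary."""
    modes = Counter(m for t in trips for m in t["modes"])
    avg_spend = sum(t["cost_usd"] for t in trips) / len(trips)
    favorite = modes.most_common(1)[0][0]
    return "prefers {}; typical spend ${:.2f} per trip".format(
        favorite, avg_spend)

history = [  # invented records standing in for captured customer data
    {"modes": ["Caltrain", "Muni bus"], "cost_usd": 9.50},
    {"modes": ["Caltrain"], "cost_usd": 7.00},
    {"modes": ["ride-hail"], "cost_usd": 32.00},
]
print(summarize_history(history))
```

Whether such a summary should be computed conventionally, as here, or by the LLM itself, and how to do either without running into the privacy issues mentioned above, remains open.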
Where does all this leave us regarding the application of generative AI to new mobility? Applications such as the summarization of customer data, the creation of various types of synthetic data, the optimization of operations, and the training or retraining of employees fall under the category of generative AI as an enabler of new mobility. Using voice-to-voice chatbots to completely rethink the in-cabin human-machine interaction, using other types of chatbots in conjunction with various employee-facing applications, summarizing focus group and customer feedback to optimize offerings, delivery times, and/or prices, creating initial component or entire vehicle designs, creating new software code, and translating existing code of varying complexity from one computer language to another all fall under the category of generative AI as a differentiator. Imagine, for example, a vehicle designer collaborating with a text-to-image generative AI system like DALL-E or Stable Diffusion to rapidly create new design ideas that improve a vehicle’s characteristics, for example, designs that reduce the Toyota Prius’ drag coefficient from today’s 0.25 to 0.23 (the Tesla Model 3’s drag coefficient), or even lower. The system could then automatically generate the code for Catia or Siemens NX, enabling the designer to put the final touches on the design and automatically simulate its performance. Automating such a loop and significantly shortening the time each iteration takes will radically change the vehicle design process. Finally, in addition to trip-planning services, one can imagine using generative AI systems in customer-facing applications and in other services that enhance the mobility-related customer experience, many of which can be monetized.
The generative AI systems being introduced sometimes impress us and other times disappoint us, but more importantly, they cause us to think about our role in the various processes and activities in which we are involved. Are these the right technologies and architectures to use? Do we need to combine open-source with proprietary models to achieve the desired results? If so, how, and at what cost? In business in general, and in new mobility in particular, while we don’t know which applications could ultimately be implemented using generative AI systems, we can envision them reducing costs, increasing monetization, and improving productivity. But we need to approach them carefully. We must address the problems I outlined and many others that I missed or that will surface. Seeing these systems as assistants and collaborators will be great. Seeing them as outright substitutes will lead to problems and conundrums that we are not yet ready to address.
Given the potential power of ChatGPT/LLMs/Generative AI, and their weaknesses, another need is for “prompt engineering”, the way that humans work with the system. By knowing what it can and cannot do, and how it does what it does, a user can submit better queries and avoid less useful ones.
This week a friend sent me a document with suggested prompts for getting the most out of ChatGPT, and I listened to an a16z podcast on prompt engineering.
Will this lead to the development of ChatGPT/LLM/Generative AI interface tools that do the best framing of queries?
Someone needs to start collecting tips and tricks for the best mobility queries, similar to what you have described here, and build them into another AI layer to interface with the Generative AI. Or will it be built into ChatGPT and the other tools?
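Such a layer could start as little more than a template library. The sketch below, whose template wording is invented, turns a structured mobility request into the kind of fully specified query used in the journey-planning experiment described in the post:

```python
# Hypothetical "prompt layer" that turns a structured mobility request
# into a well-formed query, encoding collected prompting tips. The
# template wording is invented for illustration.

TEMPLATE = (
    "On {day} I want to travel from {origin} to {destination}. "
    "I can leave by {earliest} and must arrive no later than {latest}. "
    "I can use {modes} but cannot spend more than ${budget:.2f}. "
    "List each option with its total cost, total duration, and the time "
    "spent in each mode."
)

def mobility_prompt(**kwargs):
    """Fill the template; missing fields raise a KeyError early, before
    an under-specified query ever reaches the model."""
    return TEMPLATE.format(**kwargs)

q = mobility_prompt(day="Tuesday", origin="Palo Alto",
                    destination="the Kabuki Hotel in San Francisco",
                    earliest="7:30 am", latest="8:30 am",
                    modes="public transportation and ride-hailing",
                    budget=10)
print(q)
```

A real product would presumably go further, validating field values and iterating on the template as prompting tips accumulate.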
Here is the link to the a16z prompt engineering podcast and description:
https://a16z.simplecast.com/episodes/unlocking-creativity-with-prompt-engineering-dxnNWNDY
Unlocking Creativity with Prompt Engineering
MARCH 9TH, 2023 | 38:31 | E708
With every new technology, some jobs are lost while others are gained. People often focus on the former, but in this episode we chose to highlight the latter – a highly creative role that emerges alongside AI: the prompt engineer.
Until AI can close the loop of its own, each tool still requires a set of prompts. Just like a composer feeds an instrument the notes to play, a prompt engineer feeds an AI a map of what to produce. And if we know anything from music it’s that composing great music takes great skill!
In this episode we explore the emerging importance of prompting with Guy Parsons, the early learnings of how to do it effectively, and where this field might be going.
Will the prompt engineer be more like the highly sought after DevOps engineer, or a proficiency like Excel that you find on every resume? Listen in to hear Guy’s take.
Resources:
DALL-E 2 Prompt Book: https://dallery.gallery/the-dalle-2-prompt-book/
Find Guy on Twitter: https://twitter.com/GuyP
Guy’s combining image experiment: https://twitter.com/GuyP/status/1612880405207580672
Guy’s amorphous prompt experiment: https://twitter.com/GuyP/status/1608475973300948993
Guy’s space duck: https://twitter.com/GuyP/status/1601342688225525761
Prompt base: https://promptbase.com/
Lexica: https://lexica.art/
Topics Covered:
0:00 – Introduction
01:49 – DALL-E 2 Prompt Book
05:29 – Parallel skills
06:51 – 80/20 prompting
10:16 – New ways of prompting
13:44 – Pulling the AI slot machine
18:09 – Comparing models
21:04 – Requested features
26:34 – Learning with AI
27:58 – Practical use cases
32:08 – A top 1% prompt engineer
36:17 – The most popular images
There is a lot of experimentation happening right now with models such as ChatGPT, DALL-E and others. Some of this will lead to great new research results and new applications of #generativeAI. However, the majority of experimentation today is due to the novelty of ChatGPT. There are hundreds (maybe even thousands) of YouTube videos that relate to ways to use ChatGPT to accomplish certain tasks (and make money in the process).
It is very smart of OpenAI to enable plugins. That will help build an ecosystem in the same way that extensions were built for Chrome.
I expect that we will see several #startups providing intelligent interfaces to GPT-X that play the role of the prompt engineer in a specific domain, not unlike the natural language interfaces to relational databases we saw several years ago.