Google recently announced that Gemini in Android Studio now supports multimodal input. Developers can attach images directly to their prompts and get visual assistance while building an app. The feature is another step toward more intelligent development tooling.
This multimodal capability was first shown at the I/O 2024 conference, where Gemini was able to "understand simple wireframes, and transform them into working Jetpack Compose code." In the Canary release of Android Studio Narwhal, the "Ask Gemini" field has a new "Attach Image File" option that accepts JPEG or PNG files. Google recommends using images with "strong color contrasts" and writing "clear prompts" for the best results.
Developers can upload UI screenshots and designs ranging from simple wireframes to high-fidelity mockups, and can specify the functionality they expect. In a calculator design example, for instance, the prompt can ask Gemini to "make the interactions and calculations work as expected." This flexibility lets Gemini assist at several stages of UI work rather than only with a finished design.
Typical prompts for converting a visual design into functional UI code include:
1. "For this image provided, write Android Jetpack Compose code to make a screen that's as close to this image as possible. Make sure to include imports, use Material3, and document the code."
2. "For this image provided, write Android Jetpack Compose code to make a screen that's as close to this image as possible, and get creative with the colors. Make the interactions and calculations work as expected. Make sure to include imports, use Material3, and document the code."
Prompts like these tell Gemini exactly what the output should contain: the imports, the Material3 components, and documentation comments.
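As an illustration of the kind of output such a request asks for (imports included, Material3 components, documented code), here is a minimal hand-written sketch of an addition-only calculator screen. It is not actual Gemini output. The names `CalcState`, `press`, `total`, and `CalculatorScreen` are hypothetical, and the snippet assumes an Android project with the Compose and Material3 dependencies already configured.

```kotlin
import androidx.compose.foundation.layout.Arrangement
import androidx.compose.foundation.layout.Column
import androidx.compose.foundation.layout.Row
import androidx.compose.foundation.layout.fillMaxWidth
import androidx.compose.foundation.layout.padding
import androidx.compose.material3.Button
import androidx.compose.material3.MaterialTheme
import androidx.compose.material3.Text
import androidx.compose.runtime.Composable
import androidx.compose.runtime.getValue
import androidx.compose.runtime.mutableStateOf
import androidx.compose.runtime.remember
import androidx.compose.runtime.setValue
import androidx.compose.ui.Modifier
import androidx.compose.ui.text.style.TextAlign
import androidx.compose.ui.unit.dp

/** Hypothetical UI state: the digits on screen and the running sum, if any. */
data class CalcState(val display: String = "0", val pending: Long? = null)

/** Running sum plus the number currently shown on the display. */
private fun CalcState.total(): Long = (pending ?: 0L) + display.toLong()

/** Pure state transition for one key press; only "+" is supported in this sketch. */
fun CalcState.press(key: String): CalcState = when {
    key == "C" -> CalcState()                                   // clear everything
    key == "+" -> CalcState(display = "0", pending = total())   // store the running sum
    key == "=" -> CalcState(display = total().toString())       // show the result
    display.length >= 15 -> this                                // ignore digits past 15 characters
    display == "0" -> copy(display = key)                       // replace the leading zero
    else -> copy(display = display + key)                       // append the digit
}

/** Material3 screen: a right-aligned display above a grid of buttons. */
@Composable
fun CalculatorScreen() {
    var state by remember { mutableStateOf(CalcState()) }
    val rows = listOf(
        listOf("7", "8", "9"),
        listOf("4", "5", "6"),
        listOf("1", "2", "3"),
        listOf("C", "0", "+"),
        listOf("="),
    )
    Column(
        modifier = Modifier.fillMaxWidth().padding(16.dp),
        verticalArrangement = Arrangement.spacedBy(8.dp),
    ) {
        Text(
            text = state.display,
            style = MaterialTheme.typography.displayMedium,
            textAlign = TextAlign.End,
            modifier = Modifier.fillMaxWidth(),
        )
        rows.forEach { row ->
            Row(
                modifier = Modifier.fillMaxWidth(),
                horizontalArrangement = Arrangement.spacedBy(8.dp),
            ) {
                row.forEach { key ->
                    // Each button takes an equal share of the row width.
                    Button(
                        onClick = { state = state.press(key) },
                        modifier = Modifier.weight(1f),
                    ) { Text(key) }
                }
            }
        }
    }
}
```

Keeping the key-press logic in a pure function separate from the composable is one possible design choice, not something the prompts require. It lets the "calculations work as expected" part be checked without rendering any UI.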
Google positions Gemini as a tool that provides an "initial design scaffold," and the generated code often needs further editing and tweaking. Common fixes include making sure drawables and icons are imported correctly. Google recommends treating the generated code as an efficient starting point that speeds up the UI development workflow, not as a finished implementation.
Gemini's visual analysis can also be used to identify and fix bugs: developers can "upload a screenshot of the problematic UI, and Gemini will analyze the image and suggest potential solutions." They can also attach the relevant code snippets to get more precise help.
Gemini in Android Studio also accepts architecture diagrams and can return explanations or documentation for them, similar to the Project Astra glasses demo previously shown at the I/O conference. Together with design-to-code generation and screenshot-based debugging, this extends Gemini's role in Android Studio beyond text-only assistance.