Farewell, Photoshop? Google's new AI lets you edit images by asking.

Multimodal output opens up new possibilities

True multimodal output opens up interesting new possibilities in chatbots. For example, Gemini 2.0 Flash can play interactive graphical games or generate stories with consistent illustrations, maintaining character and setting continuity across multiple images. It is far from perfect, but character consistency is a new capability for AI assistants. We tried it out, and it was pretty wild, especially when it generated a view of a photo we provided from another angle.

Creating a multi-image story with Gemini 2.0 Flash, part 2. Notice the alternate angle on the original photo.

Google / Benj Edwards

Creating a multi-image story with Gemini 2.0 Flash, part 3.

Google / Benj Edwards
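For developers who want to experiment with this kind of multimodal output themselves, the capability is exposed through the Gemini API. The snippet below is a minimal sketch using the google-genai Python SDK; the model ID, prompt, and file names are illustrative assumptions rather than details taken from this article.

```python
# Minimal sketch: request interleaved text-and-image output from Gemini 2.0 Flash
# via the google-genai SDK. Model name, prompt, and filenames are assumptions.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_API_KEY")  # placeholder key

response = client.models.generate_content(
    model="gemini-2.0-flash-exp",  # assumed experimental model ID with image output
    contents=(
        "Tell a three-part illustrated story about a fox on a road trip. "
        "Keep the same fox character in every picture."
    ),
    config=types.GenerateContentConfig(
        response_modalities=["TEXT", "IMAGE"],  # ask for both text and images
    ),
)

# Walk the returned parts: text parts hold the story, inline_data parts hold images.
for i, part in enumerate(response.candidates[0].content.parts):
    if part.text:
        print(part.text)
    elif part.inline_data:
        with open(f"story_panel_{i}.png", "wb") as f:
            f.write(part.inline_data.data)
```

Because the model returns text and images in one response, a follow-up prompt in the same chat can ask it to reuse the same character or redraw a supplied photo from another angle, which is the kind of continuity described above.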

Text rendering represents another potential strength of the model. Google claims that internal benchmarks show Gemini 2.0 Flash performing better than "leading competitive models" at generating images that contain text, which would make it suitable for creating content with integrated text. In our experience, the results were not exciting, but they were legible.

An example of in-image text rendering generated with Gemini 2.0 Flash.

Google / Benj Edwards

Despite Gemini 2.0 Flash's shortcomings so far, the emergence of true multimodal image output feels like a notable moment in AI history because of what it suggests if the technology continues to improve. If you imagine a future, say 10 years from now, in which a sufficiently sophisticated AI model could generate almost any kind of media on demand, you would basically have a holodeck, but without the matter replication.

Returning to reality, it is still "early days" for multimodal image output, and Google recognizes this. Remember that Gemini 2.0 Flash is designed to be a smaller AI model that is faster and cheaper to run, so it has not absorbed the entire breadth of the Internet. All of that information takes up a lot of space in terms of parameter count, and more parameters mean more compute. Instead, Google trained Gemini 2.0 Flash by feeding it a curated dataset, which likely also included targeted synthetic data. As a result, the model does not "know" everything visual about the world, and Google itself says the training data is "broad and general, not absolute or complete."

That is just a fancy way of saying the image output quality is not perfect yet. But there is plenty of room for future improvement as the model incorporates more visual "knowledge" and as training techniques advance and compute costs fall. If the process follows anything like what we have seen with diffusion-based AI image generators such as Stable Diffusion, Midjourney, and Flux, the quality of multimodal image output may improve rapidly over a short period of time. Get ready for a fully fluid media reality.
