AI ART generation is taking new turns weekly -- and it hurts my brain trying to grasp things like how to use AI to upload an image and have Blender generate a 3D scene (which I haven't figured out yet).
The big push now is to casually talk with AI and get it to generate images, then just continue the conversation to alter those images as well -- getting away from the more rigid "prompt engineering" mentality you associate with Midjourney.
Generating Images
All the recent rave is about how ChatGPT now creates images just by chatting with it (as of March 25th, 2025) -- even with the free option (though I had to log in):
I asked for "an image of a javelina running in the desert," and ChatGPT generated a pretty realistic image of a javelina:
"Can you create an image of
a javelina running in the desert"
You can even change the image by chatting further with ChatGPT -- though it gave me some flak, delaying my request a couple of times, probably because I was using the free version:
"Can you change that image into a print by Durer"
"I'm still on a cooldown for image generation and need to wait about five more minutes before I can create the Dürer-style print. Let me know if you'd like to try again later!"
"cooldown...try again later"
Eventually ChatGPT did transform my javelina image into a decent print image in the style of Dürer:
"Can you change that image
into a print by Durer"
ChatGPT did generate decent images, but it really throttles that service for free users.
Image Generator
REVE is a new AI image generator (free at the moment) that also made pretty realistic images from the text prompt "a javelina running in the desert":
Four AI generations of
"a javelina running in the desert"
The best of the four REVE images
HOWEVER, it did not do a great job when I uploaded my drawing, since REVE would not even take a guiding prompt with the image:
Not a very faithful
AI interpretation of my drawing
REVE generated four unfaithful
variations of my drawing
Generating Images
- Gemini 2.0 Flash (Image Generation) Experimental (generates raster images)
- Gemini 2.5 Pro Experimental 03-25 (only generates vector images)
RASTER IMAGES
JPG
The Google AI Studio
with the Gemini 2.0 model
Gemini 2.0 generated
not the best javelina
"As a large language model, I cannot directly generate or output binary image files like JPGS."
"I cannot directly generate ... image files like JPGS."
VECTOR IMAGES
HOWEVER the advanced Gemini 2.5 model WOULD generate vector files, specifically of the roadrunner I uploaded:
After generating a roadrunner in AI, and converting it into a 3D model to be seen in Augmented Reality...
I uploaded my AI image of a roadrunner,
and Gemini 2.5 generated a vector image of it
download: roadrunner.SVG (5.16 KB)
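A likely reason a text-based model can hand back an SVG but not a JPG is that an SVG file is just plain text describing shapes. Here is a minimal sketch in Python to show the idea. The function name, the shapes, and the colors are all made up for illustration; this is not the roadrunner file Gemini 2.5 produced.

```python
import xml.etree.ElementTree as ET


def make_svg(width=200, height=100):
    """Return a tiny SVG document as a plain text string (hypothetical shapes)."""
    shapes = [
        # Sandy background rectangle covering the whole canvas
        f'<rect width="{width}" height="{height}" fill="#f4e1b5"/>',
        # A rough body and head, just an ellipse and a circle
        '<ellipse cx="100" cy="60" rx="40" ry="20" fill="#6b5b4b"/>',
        '<circle cx="145" cy="50" r="12" fill="#6b5b4b"/>',
    ]
    body = "\n  ".join(shapes)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">\n'
        f"  {body}\n"
        "</svg>\n"
    )


svg_text = make_svg()
print(svg_text)

# Parse the text back to confirm it is well-formed XML with three shapes in it
root = ET.fromstring(svg_text)
print(len(root), "shapes")
```

Saving that text to a file ending in .svg should give you something a browser or vector editor can open, which is why a model that only writes text can still "draw" this way.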
3D IMAGES
Gemini 2.5 generated code
to make a headless javelina in OpenSCAD
Last December ChatGPT also generated OpenSCAD code, and even gave the javelina a head:
DESCRIPTION
of my STYLE
EVEN more impressive was how Gemini 2.5 described one of my drawings after I uploaded it:
I uploaded my drawing to Gemini 2.5
The drawing I uploaded
"Based on the visual characteristics, this drawing style can be described using several terms:
- Sketchy / Sketch: It's clearly not a finished, polished piece but rather a quick study or exploration of form.
- Linear: The style relies heavily on lines rather than shading or tone to define form.
- Abstract / Abstracted: The figure is not rendered realistically. Forms are simplified, fragmented, and stylized.
- Analytical / Structural: The drawing seems to break down the subject (a figure, perhaps?) into component parts or planes, suggesting an analysis of its structure rather than just its appearance. The blocky shapes reinforce this.
- Cubist Influence: The fragmentation of form, the use of geometric shapes (especially the blocky elements intersecting the figure), and the way different viewpoints might be implied are strongly reminiscent of preparatory sketches from the Cubist movement (think Picasso or Braque sketches).
Combining these, you could call it an analytical abstract sketch with Cubist influences. It prioritizes understanding structure and form through fragmented, linear marks over realistic representation."
This was the best AI description yet of my drawing style.
NOTE: The Gemini 2.0 model was pretty far off earlier this month when I asked "Can you tell me what kind of style that KRRRL uses?"