Generating Consistent Faces in Stable Diffusion: Techniques and Methods

26.01.2024

Stable Diffusion is a powerful tool for creating photorealistic images. A key challenge, however, is generating the same face consistently from image to image, especially when the face should be recognizable across a series yet still original. This article explores one way to strike that balance: using celebrity names as a base for face generation.

Base Prompt and Its Impact

Our base prompt is designed to generate a generic face:

"photo of young woman, highlight hair, sitting outside restaurant, wearing dress, rim lighting, studio lighting, looking at the camera, dslr, ultra quality, sharp focus, tack sharp, dof, film grain, Fujifilm XT3, crystal clear, 8K UHD, highly detailed glossy eyes, high detailed skin, skin pores."

To maintain consistency in image quality, we use the following negative prompt throughout this article:

"disfigured, ugly, bad, immature, cartoon, anime, 3d, painting, b&w."

The Power of Celebrity Names

Introducing a celebrity name into the prompt strongly steers the generated face. For instance, adding "Emma Watson" to the base prompt immediately pulls the generated face toward her recognizable features. This shows how strongly celebrity names influence Stable Diffusion's output.
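
As a minimal sketch of this step, the snippet below keeps the base and negative prompts from this article as constants and builds one generic and one named variant. The helper names (with_name, make_settings) and the dictionary keys are hypothetical and not tied to any particular Stable Diffusion interface. Placing the name at the front of the prompt is also an assumption, since the article only says the name is added to the base prompt.

```python
# Base and negative prompts as given in the article.
BASE_PROMPT = (
    "photo of young woman, highlight hair, sitting outside restaurant, "
    "wearing dress, rim lighting, studio lighting, looking at the camera, "
    "dslr, ultra quality, sharp focus, tack sharp, dof, film grain, "
    "Fujifilm XT3, crystal clear, 8K UHD, highly detailed glossy eyes, "
    "high detailed skin, skin pores"
)
NEGATIVE_PROMPT = "disfigured, ugly, bad, immature, cartoon, anime, 3d, painting, b&w"


def with_name(name: str, base: str = BASE_PROMPT) -> str:
    """Put a name in front of the base prompt (the placement is an assumption)."""
    return f"{name}, {base}"


def make_settings(prompt: str) -> dict[str, str]:
    """Pair any prompt with the one fixed negative prompt (hypothetical keys)."""
    return {"prompt": prompt, "negative_prompt": NEGATIVE_PROMPT}


generic = make_settings(BASE_PROMPT)
named = make_settings(with_name("Emma Watson"))
print(named["prompt"])
```

Because only the name changes between the two variants, any difference in the generated face can be attributed to the name rather than to the rest of the prompt.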

Creating a Generic Yet Consistent Face

But what if the goal is a generic face? The key is to blend multiple celebrity faces. Given a combination of names, Stable Diffusion synthesizes a single face that is unique yet consistent. For our experiment, we blend the features of Melanie Laurent, Brittany Murphy, Elizabeth Turner, Ellana Bryan, and Natalia Dyer, with specific emphasis on the eyes and nose of Olivia Newton-John.

The prompt becomes:

"she's a mix of (Melanie Laurent:1.3), (Brittany Murphy:1.9), (Elizabeth Turner:1.8), (Ellana Bryan:1.4) and (Natalia Dyer:1.9), with (the eyes and nose of Olivia Newton-John:1.9), photo of young woman, highlight hair, sitting outside restaurant, wearing dress, rim lighting, studio lighting, looking at the camera, dslr, ultra quality, sharp focus, tack sharp, dof, film grain, Fujifilm XT3, crystal clear, 8K UHD, highly detailed glossy eyes, high detailed skin, skin pores."

This adjusted prompt yields a face that consistently appears across different images, yet doesn't strongly resemble any one of the mentioned celebrities.
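
The blended prompt above can also be assembled programmatically, which makes it easier to swap names or weights between runs. The following is a rough sketch only: the helper names weighted and blend_prompt are hypothetical, and the base prompt is shortened here for brevity.

```python
def weighted(term: str, weight: float) -> str:
    """Format a term in the (term:weight) form used in the article's prompt."""
    return f"({term}:{weight})"


def blend_prompt(faces: dict[str, float], feature: tuple[str, float], base: str) -> str:
    """Build the 'mix of ...' prefix and place it before the base prompt."""
    if not faces:
        raise ValueError("at least one name is required")
    parts = [weighted(name, w) for name, w in faces.items()]
    # Join as "a, b, c and d", matching the phrasing of the article's prompt.
    mix = parts[0] if len(parts) == 1 else ", ".join(parts[:-1]) + " and " + parts[-1]
    return f"she's a mix of {mix}, with {weighted(*feature)}, {base}"


# Names and weights taken from the article's prompt.
faces = {
    "Melanie Laurent": 1.3,
    "Brittany Murphy": 1.9,
    "Elizabeth Turner": 1.8,
    "Ellana Bryan": 1.4,
    "Natalia Dyer": 1.9,
}
feature = ("the eyes and nose of Olivia Newton-John", 1.9)

# Shortened base prompt, for illustration only.
prompt = blend_prompt(faces, feature, "photo of young woman, highlight hair")
print(prompt)
```

Keeping the names and weights in one dictionary means the same face recipe can be reused with different scene descriptions simply by changing the base prompt.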

Fine-Tuning with Keyword Weights

The balance of features in the generated face can be adjusted with keyword weights, written as (keyword:weight) in the prompt above. The weight assigned to each celebrity's name sets how strongly that name influences the final image, so raising or lowering a single weight shifts the face toward or away from that person. This gives nuanced control over the facial features while keeping the result consistent and unique.
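
To experiment with weights, it can help to read them out of an existing prompt and change one at a time. The sketch below assumes the simple, non-nested (keyword:weight) form seen in this article's prompt. The function names read_weights and set_weight are hypothetical, and the example prompt is a shortened version of the one above.

```python
import re

# Matches a non-nested "(term:number)" group, e.g. "(Natalia Dyer:1.9)".
WEIGHTED = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")


def read_weights(prompt: str) -> dict[str, float]:
    """Return every weighted term in the prompt with its numeric weight."""
    return {m.group(1): float(m.group(2)) for m in WEIGHTED.finditer(prompt)}


def set_weight(prompt: str, term: str, weight: float) -> str:
    """Replace the weight of one term and leave the rest of the prompt untouched."""
    pattern = re.compile(r"\(" + re.escape(term) + r":[0-9]*\.?[0-9]+\)")
    # A function replacement avoids any backslash interpretation in the new text.
    return pattern.sub(lambda _: f"({term}:{weight})", prompt)


example = "she's a mix of (Melanie Laurent:1.3) and (Natalia Dyer:1.9), photo of young woman"
weights = read_weights(example)
softer = set_weight(example, "Natalia Dyer", 1.5)
print(weights)
print(softer)
```

Changing one weight per run, with everything else held fixed, makes it easier to see which name is responsible for a given feature in the output.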

Conclusion

Generating consistent faces in Stable Diffusion means balancing specificity and generality. By blending celebrity features and adjusting their influence through keyword weights, artists can create unique faces that stay consistent across images. This method gives artists greater control and precision in visual storytelling.
