The Technology Behind You Need to Know in 2023

Bitnine Global Marketing team

Fri Jul 05 2024

Lately, we've been hearing a lot about 'ChatGPT,' 'AI-generated images,' and even 'AIs taking over our jobs!'

Why is this technology suddenly getting so much attention? It's because of its immense impact on computing. This article will unravel the technology behind AI-generated images, offering insights into the advancements, techniques, and applications that make them a crucial part of our digital future.


How Do AIs Generate Images?

The Stable Diffusion Model is a generative AI model that creates images based on text input. It can generate detailed images according to the text provided and even recreate specific parts of existing images. Let's break it down.

Figure 1. Extracting Objects (Nouns) and Relationships (Predicates) from images and structuring them into a Scene Graph (Source: Scene Graph Generation by Iterative Message Passing)


What Is a Scene Graph?

A Scene Graph detects objects in an image or video and infers the relationships between them, forming a graph. For example, in an image of a mountain and a horse, the mountain and horse are extracted as nodes, and their relationship (e.g., "behind") forms the edges of the graph. This method visually expresses complex relationships, making inferences within images clearer.
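As a minimal sketch of the structure described above, the mountain-and-horse example can be written as a small graph in Python. The class and method names (`SceneGraph`, `Relation`, `add_relation`, `relations_of`) are hypothetical and chosen only for illustration, and the direction of the "behind" edge (mountain behind horse) is an assumption.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relation:
    """A directed, labeled edge: subject --predicate--> obj."""
    subject: str
    predicate: str
    obj: str


@dataclass
class SceneGraph:
    """Objects are nodes; relationships are labeled, directed edges."""
    nodes: set = field(default_factory=set)
    edges: list = field(default_factory=list)

    def add_relation(self, subject, predicate, obj):
        # Adding an edge also registers both endpoints as nodes.
        self.nodes.update({subject, obj})
        self.edges.append(Relation(subject, predicate, obj))

    def relations_of(self, name):
        """Return every edge in which `name` is the subject."""
        return [e for e in self.edges if e.subject == name]


graph = SceneGraph()
# Assumed direction: the mountain is behind the horse.
graph.add_relation("mountain", "behind", "horse")

print(sorted(graph.nodes))
print(graph.relations_of("mountain"))
```

A real scene graph generator would detect the objects and infer the predicate from the image; here both are entered by hand to show only the data structure.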


Turning Text into Images

Several methods use Scene Graphs to turn text prompts into images, such as 'Frido,' 'SceneGenie,' 'SGDiff,' 'SG2Im,' and 'DiffuScene.' Among these, SceneGenie uses the Stable Diffusion Model and is known for its high accuracy. Here’s how it works:


Stable Diffusion Model Process

Figure 2. Noising and Denoising, processing the latent information on the CLIP model using Encoder and Decoder (image source)


The Stable Diffusion Model uses OpenAI's Contrastive Language-Image Pre-training (CLIP) model, which maps text to corresponding image information. The process involves noising and denoising: during training, noise is added to an image and the model learns to remove it; when generating, the model starts from noise and removes it step by step, guided by the text.
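The noising and denoising idea can be sketched with a few lines of plain Python. This is a toy illustration of the standard diffusion forward step, x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps, not Stable Diffusion's actual implementation: the linear schedule values, the four-number "latent", and the function names are all assumptions, and the true noise stands in for what a trained network would predict.

```python
import math
import random


def make_alpha_bars(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear noise schedule."""
    alpha_bars, running = [], 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, alpha_bar, rng):
    """Noising: blend the clean signal with Gaussian noise eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(alpha_bar) * x + math.sqrt(1.0 - alpha_bar) * e
          for x, e in zip(x0, eps)]
    return xt, eps


def remove_noise(xt, eps_pred, alpha_bar):
    """Denoising: invert the noising step given an estimate of the noise."""
    return [(x - math.sqrt(1.0 - alpha_bar) * e) / math.sqrt(alpha_bar)
            for x, e in zip(xt, eps_pred)]


rng = random.Random(0)
alpha_bars = make_alpha_bars(num_steps=1000)  # illustrative schedule
x0 = [0.5, -0.25, 1.0, 0.0]                   # stand-in for a tiny latent
xt, eps = add_noise(x0, alpha_bars[500], rng)

# A trained model would predict eps from xt (and the text embedding);
# passing the true eps shows that a perfect prediction recovers x0.
x0_hat = remove_noise(xt, eps, alpha_bars[500])
print(x0_hat)
```

The point of the sketch is that learning to predict the added noise is enough to undo it, which is what training on noised images teaches the model.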


SceneGenie - Stable Diffusion Using Scene Graph

Figure 3. Architecture of SceneGenie (Source: SceneGenie: Scene Graph Guided Diffusion Models for Image Synthesis)


SceneGenie enhances the Stable Diffusion Model by using the Scene Graph to predict approximate locations for the objects in the generated image. This approach improves accuracy, especially for complex prompts involving multiple objects and actions.
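To make "predicting approximate locations from a Scene Graph" concrete, here is a toy, rule-based sketch. It is not SceneGenie's method, which this article does not detail; the predicate offsets, the unit canvas, the function name `rough_layout`, and the example triples are all hypothetical.

```python
# Hypothetical offsets on a unit canvas: where the subject sits
# relative to the object, as (dx, dy). y grows downward.
OFFSETS = {
    "left of": (-0.3, 0.0),
    "right of": (0.3, 0.0),
    "above": (0.0, -0.3),
    "below": (0.0, 0.3),
}


def clamp(value, low=0.0, high=1.0):
    """Keep a coordinate inside the canvas."""
    return min(max(value, low), high)


def rough_layout(triples, anchor=(0.5, 0.5)):
    """Assign each object an approximate (x, y) centre from relation triples."""
    centres = {}
    for subject, predicate, obj in triples:
        # An object with no position yet is placed at the anchor point.
        if obj not in centres:
            centres[obj] = anchor
        dx, dy = OFFSETS[predicate]
        ox, oy = centres[obj]
        centres[subject] = (clamp(ox + dx), clamp(oy + dy))
    return centres


# Hypothetical scene graph edges, listed so that each object is
# positioned before another object refers to it.
layout = rough_layout([
    ("boat", "left of", "tree"),
    ("bird", "above", "boat"),
])
for name, centre in layout.items():
    print(name, centre)
```

Even this crude layout shows why explicit edges help: each object gets its own position constraint, which a generator can then use as guidance instead of inferring placement from free text.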



Why Use Scene Graphs?

Scene Graphs enable more accurate image generation by decoding relationships among complex objects. This technology addresses the limitations of existing AI models, which struggle with complex text prompts and multiple objects.

Figure 4. A text prompt that cannot distinguish objects (left) and a scene graph that can distinguish between objects and relationships (right)

Figure 5. Comparison of images generated by DALL·E 2 (left) and SceneGenie (right)

(Source: SceneGenie: Scene Graph Guided Diffusion Models for Image Synthesis)



Conclusion

In the example above, SceneGenie, which uses a Scene Graph, generates better images for complex objects and relationships. In the image generated by DALL·E 2, the word 'sheep' was recognized as a single object, whereas the Scene Graph treated the two mentions of 'sheep' in the text prompt as two different objects. Other objects, such as the boat, are also positioned more accurately, as requested.
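The 'sheep' case can be illustrated with a short sketch: a label-only view of the prompt merges repeated words, while a scene graph keeps one node per mention. The mention list, the relations, and the helper name `build_nodes` are hypothetical, since the original prompt is not reproduced here.

```python
from collections import Counter


def build_nodes(mentions):
    """Give every mention its own node id, e.g. sheep_1 and sheep_2."""
    seen = Counter()
    nodes = []
    for label in mentions:
        seen[label] += 1
        nodes.append(f"{label}_{seen[label]}")
    return nodes


# Hypothetical object mentions extracted from a prompt.
mentions = ["sheep", "sheep", "boat"]

as_word_set = set(mentions)        # label-only view: the two sheep collapse
as_nodes = build_nodes(mentions)   # scene graph view: one node per instance

# Hypothetical relations between the distinct instances.
edges = [
    (as_nodes[0], "next to", as_nodes[1]),
    (as_nodes[1], "behind", as_nodes[2]),
]

print(sorted(as_word_set))
print(as_nodes)
print(edges)
```

Because the two sheep are separate nodes, each can carry its own relationships, which is what lets the generator draw both rather than one.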


In essence, using Scene Graphs for image generation creates more accurate images by decoding the relationships among complex objects. Existing image-generating AI often struggles to generate the desired image with complex text prompts, especially when multiple objects are present.


A Scene Graph expresses relationships between objects as a graph, making it easier to infer the situation within an image or text. As generative AI continues to rise, adopting Scene Graphs is becoming increasingly essential for achieving accurate and meaningful synthesis.


Do Scene Graphs and graph technology still feel too advanced?

Don't worry! Bitnine Global can plug graph technology into your existing data model.


Contact us at marketing@bitnineglobal.com or visit our website!


© 2024 Bitnine Holdings, Inc. All rights reserved.

 

Apache AGE™ and the Apache AGE™ logo are trademarks of the Apache Software Foundation.

All other trademarks are the property of their respective owners.
