Tumgik
#styledrop
styledrop · 11 months
Text
Tumblr media
Burna Boy In Balmain
18 notes · View notes
jcmarchi · 3 months
Text
StyleDrop: Text-to-image generation in any style
New Post has been published on https://thedigitalinsider.com/styledrop-text-to-image-generation-in-any-style/
Posted by Kihyuk Sohn and Dilip Krishnan, Research Scientists, Google Research
Text-to-image models trained on large volumes of image-text pairs have enabled the creation of rich and diverse images encompassing many genres and themes. Moreover, popular styles such as “anime” or “steampunk”, when added to the input text prompt, may translate to specific visual outputs. While many efforts have been put into prompt engineering, a wide range of styles are simply hard to describe in text form due to the nuances of color schemes, illumination, and other characteristics. As an example, “watercolor painting” may refer to various styles, and using a text prompt that simply says “watercolor painting style” may either result in one specific style or an unpredictable mix of several.
When we refer to “watercolor painting style,” which do we mean? Instead of specifying the style in natural language, StyleDrop allows the generation of images that are consistent in style by referring to a style reference image*.
In this blog we introduce “StyleDrop: Text-to-Image Generation in Any Style”, a tool that allows a significantly higher level of stylized text-to-image synthesis. Instead of seeking text prompts to describe the style, StyleDrop uses one or more style reference images that describe the style for text-to-image generation. By doing so, StyleDrop enables the generation of images in a style consistent with the reference, while effectively circumventing the burden of text prompt engineering. This is done by efficiently fine-tuning a pre-trained text-to-image generation model via adapter tuning on a few style reference images. Moreover, by iteratively fine-tuning StyleDrop on a set of images it generated, the model achieves style-consistent image generation from text prompts.
Method overview
StyleDrop is a text-to-image generation model that allows generation of images whose visual styles are consistent with the user-provided style reference images. This is achieved by a couple of iterations of parameter-efficient fine-tuning of pre-trained text-to-image generation models. Specifically, we build StyleDrop on Muse, a text-to-image generative vision transformer.
Muse: text-to-image generative vision transformer
Muse is a state-of-the-art text-to-image generation model based on the masked generative image transformer (MaskGIT). Unlike diffusion models, such as Imagen or Stable Diffusion, Muse represents an image as a sequence of discrete tokens and models their distribution using a transformer architecture. Compared to diffusion models, Muse is known to be faster while achieving competitive generation quality.
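To make the "sequence of discrete tokens" idea concrete, here is a minimal sketch of MaskGIT-style parallel decoding. It is an illustration under stated assumptions, not Muse's implementation: the `toy_predict` function is a hypothetical stand-in for the transformer (it returns random tokens and confidences), and the cosine masking schedule is the one commonly associated with MaskGIT rather than something stated in this post.

```python
import math
import random

MASK = -1  # sentinel for a token position that has not been decoded yet


def toy_predict(tokens, vocab_size, rng):
    """Hypothetical stand-in for the transformer.

    Returns one (token_id, confidence) pair per position. A real model
    would condition on the text prompt and the already-decoded tokens.
    """
    return [(rng.randrange(vocab_size), rng.random()) for _ in tokens]


def parallel_decode(seq_len, vocab_size, steps, seed=0):
    """Fill a fully masked token sequence over a fixed number of steps."""
    rng = random.Random(seed)
    tokens = [MASK] * seq_len
    for step in range(1, steps + 1):
        preds = toy_predict(tokens, vocab_size, rng)
        # Cosine schedule: how many positions remain masked after this step.
        keep_masked = int(seq_len * math.cos(math.pi / 2 * step / steps))
        masked = [i for i, t in enumerate(tokens) if t == MASK]
        # Commit the most confident predictions among the masked positions.
        masked.sort(key=lambda i: preds[i][1], reverse=True)
        n_commit = max(len(masked) - keep_masked, 0)
        for i in masked[:n_commit]:
            tokens[i] = preds[i][0]
    return tokens


image_tokens = parallel_decode(seq_len=16, vocab_size=8192, steps=4)
```

Because many tokens are committed per step, the whole sequence is produced in a handful of passes, which is one intuition for why this family of models can be fast.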
Parameter-efficient adapter tuning
StyleDrop is built by fine-tuning the pre-trained Muse model on a few style reference images and their corresponding text prompts. There have been many works on parameter-efficient fine-tuning of transformers, including prompt tuning and Low-Rank Adaptation (LoRA) of large language models. Among those, we opt for adapter tuning, which has been shown to be effective at fine-tuning a large transformer network for language and image generation tasks in a parameter-efficient manner. For example, it introduces fewer than one million trainable parameters to fine-tune a Muse model of 3B parameters, and it requires only 1,000 training steps to converge.
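A rough sketch of why adapters are so small: a bottleneck adapter adds a down-projection, a nonlinearity, an up-projection, and a residual connection to each layer, while the base weights stay frozen. The dimensions used below (`d_model=1024`, `bottleneck=16`, `n_layers=24`) are hypothetical and chosen only for illustration; they are not Muse's actual configuration, and the adapter layout is a generic one rather than the exact design used in StyleDrop.

```python
import random


def adapter_param_count(d_model, bottleneck, n_layers):
    """Trainable parameters for one bottleneck adapter per layer."""
    down = d_model * bottleneck + bottleneck  # weights + bias
    up = bottleneck * d_model + d_model       # weights + bias
    return (down + up) * n_layers


def make_adapter(d_model, bottleneck, seed=0):
    """Create adapter weights; the up-projection starts at zero."""
    rng = random.Random(seed)
    return {
        "down": [[rng.gauss(0.0, 0.02) for _ in range(d_model)]
                 for _ in range(bottleneck)],
        "down_b": [0.0] * bottleneck,
        "up": [[0.0] * bottleneck for _ in range(d_model)],
        "up_b": [0.0] * d_model,
    }


def adapter_forward(adapter, h):
    """Apply h + up(relu(down(h))) to one hidden vector h."""
    z = [max(0.0, sum(w * x for w, x in zip(row, h)) + b)
         for row, b in zip(adapter["down"], adapter["down_b"])]
    delta = [sum(w * x for w, x in zip(row, z)) + b
             for row, b in zip(adapter["up"], adapter["up_b"])]
    return [x + d for x, d in zip(h, delta)]


# Hypothetical sizes, not Muse's real configuration.
n_trainable = adapter_param_count(d_model=1024, bottleneck=16, n_layers=24)
fraction_of_3b = n_trainable / 3_000_000_000
```

With these illustrative sizes the count stays under one million, a tiny fraction of a 3B-parameter model. Initializing the up-projection to zero means the adapter leaves the pre-trained model's output unchanged at the start of fine-tuning.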
Parameter-efficient adapter tuning of Muse.
Iterative training with feedback
While StyleDrop is effective at learning styles from a few style reference images, it is still challenging to learn from a single style reference image. This is because the model may not effectively disentangle the content (i.e., what is in the image) and the style (i.e., how it is being presented), leading to reduced text controllability in generation. For example, as shown below in Steps 1 and 2, a generated image of a chihuahua from StyleDrop trained on a single style reference image shows content leakage (i.e., the house) from the style reference image. Furthermore, a generated image of a temple looks too similar to the house in the reference image (concept collapse).
We address this issue by training a new StyleDrop model on a subset of synthetic images generated by the first-round StyleDrop model, which was trained on a single image. The subset is chosen by the user or by image-text alignment models (e.g., CLIP). By training on multiple synthetic images that are well aligned with their text prompts, the model can more easily disentangle the style from the content, thus achieving improved image-text alignment.
Iterative training with feedback*. The first round of StyleDrop may result in reduced text controllability, such as a content leakage or concept collapse, due to the difficulty of content-style disentanglement. Iterative training using synthetic images, generated by the previous rounds of StyleDrop models and chosen by human or image-text alignment models, improves the text adherence of stylized text-to-image generation.
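The selection step between rounds can be sketched as a simple filter over first-round outputs. This is a minimal illustration, assuming each synthetic image already has an image-text alignment score; the `Candidate` type, the score values, and the `top_k` and `min_alignment` parameters are all hypothetical, and a human rater could replace the automatic score entirely.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    prompt: str
    image_id: str
    alignment: float  # hypothetical image-text alignment score (e.g. from CLIP)


def select_for_next_round(candidates, top_k, min_alignment=0.0):
    """Keep the best-aligned synthetic images as round-two training data."""
    eligible = [c for c in candidates if c.alignment >= min_alignment]
    eligible.sort(key=lambda c: c.alignment, reverse=True)
    return eligible[:top_k]


# Made-up first-round outputs; the low score stands in for concept collapse.
round_one = [
    Candidate("a chihuahua", "img_001", 0.31),
    Candidate("a temple", "img_002", 0.18),
    Candidate("a banana", "img_003", 0.29),
]
round_two_training_set = select_for_next_round(round_one, top_k=2, min_alignment=0.25)
```

Images that drifted toward the reference content score poorly against their prompts and are dropped, so the second round sees several different subjects rendered in the same style.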
Experiments
StyleDrop gallery
We show the effectiveness of StyleDrop by running experiments on 24 distinct style reference images. As shown below, the images generated by StyleDrop are highly consistent in style with each other and with the style reference image, while depicting various contexts, such as a baby penguin, banana, piano, etc. Moreover, the model can render alphabet images with a consistent style.
Stylized text-to-image generation. Style reference images* are on the left inside the yellow box. Text prompts used are: First row: a baby penguin, a banana, a bench. Second row: a butterfly, an F1 race car, a Christmas tree. Third row: a coffee maker, a hat, a moose. Fourth row: a robot, a towel, a wood cabin.
Stylized visual character generation. Style reference images* are on the left inside the yellow box. Text prompts used are: (first row) letter ‘A’, letter ‘B’, letter ‘C’, (second row) letter ‘E’, letter ‘F’, letter ‘G’.
Generating images of my object in my style
Below we show generated images by sampling from two personalized generation distributions, one for an object and another for the style.
Images at the top in the blue border are object reference images from the DreamBooth dataset (teapot, vase, dog and cat), and the image on the left at the bottom in the red border is the style reference image*. Images in the purple border (i.e. the four lower right images) are generated from the style image of the specific object.
Quantitative results
For the quantitative evaluation, we synthesize images from a subset of Parti prompts and measure the image-to-image CLIP score for style consistency and the image-to-text CLIP score for text consistency. We study non–fine-tuned models of Muse and Imagen. Among fine-tuned models, we make a comparison to DreamBooth on Imagen, a state-of-the-art personalized text-to-image method for subjects. We show two versions of StyleDrop, one trained from a single style reference image, and another, “StyleDrop (HF)”, that is trained iteratively using synthetic images with human feedback as described above. As shown below, StyleDrop (HF) shows a significantly improved style consistency score over its non–fine-tuned counterpart (0.694 vs. 0.556), as well as over DreamBooth on Imagen (0.694 vs. 0.644). We observe an improved text consistency score with StyleDrop (HF) over StyleDrop (0.322 vs. 0.313). In addition, in a human preference study between DreamBooth on Imagen and StyleDrop on Muse, we found that 86% of the human raters preferred StyleDrop on Muse over DreamBooth on Imagen in terms of consistency with the style reference image.
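Both scores mentioned above are commonly computed as cosine similarities between CLIP embeddings. The sketch below shows only that arithmetic, assuming the embeddings have already been produced by some CLIP encoder; the function names and the averaging over images are assumptions for illustration, not the exact evaluation protocol behind the reported numbers.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def style_score(generated_image_embs, reference_image_emb):
    """Image-to-image score: mean similarity to the style reference."""
    sims = [cosine_similarity(g, reference_image_emb)
            for g in generated_image_embs]
    return sum(sims) / len(sims)


def text_score(generated_image_embs, prompt_embs):
    """Image-to-text score: mean similarity of each image to its own prompt."""
    sims = [cosine_similarity(g, t)
            for g, t in zip(generated_image_embs, prompt_embs)]
    return sum(sims) / len(sims)
```

A higher style score means the generated images sit closer to the reference image in embedding space, while a higher text score means each image stays closer to the prompt that produced it. The two can pull against each other, which is why both are reported.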
Conclusion
StyleDrop achieves style consistency in text-to-image generation using a few style reference images. Google’s AI Principles guided our development of StyleDrop, and we urge the responsible use of the technology. StyleDrop was adapted to create a custom style model in Vertex AI. We believe it could be a helpful tool for art directors and graphic designers who want to brainstorm or prototype visual assets in their own styles, improving their productivity and boosting their creativity, and for businesses that want to generate new media assets that reflect a particular brand. As with other generative AI capabilities, we recommend that practitioners ensure they respect the copyrights of any media assets they use. More results can be found on our project website and YouTube video.
Acknowledgements
This research was conducted by Kihyuk Sohn, Nataniel Ruiz, Kimin Lee, Daniel Castro Chin, Irina Blok, Huiwen Chang, Jarred Barber, Lu Jiang, Glenn Entis, Yuanzhen Li, Yuan Hao, Irfan Essa, Michael Rubinstein, and Dilip Krishnan. We thank owners of images used in our experiments (links for attribution) for sharing their valuable assets.
*See image sources ↩
1 note · View note
r0n1e · 10 months
Text
Google StyleDrop generates images from text
http://dlvr.it/SqhW57
0 notes
guidady · 10 months
Link
StyleDrop: Text-To-Image Generation in Any Style
0 notes
hackernewsrobot · 10 months
Text
StyleDrop: Text-to-Image Generation in Any Style
https://styledrop.github.io/
0 notes
Text
Livi Necklace
Tumblr media
Livi is the new Style Drop! $15 with a $40 purchase! 😍💗💗💗
1 note · View note
thegspotboutique · 4 years
Photo
Tumblr media
SLINKY, SHINEY & SEXY in this Gorgeous Vintage Very Red Asian Silk Dress... @erin_cwiertniewicz Stunning! #slinky #shiney #sexy #asianclothes #asianfashion #vintagedresses #vintagesilkdress #fashionistas #styleblog #tastemakers #styledrop #vintageboutique #thriftyvintage #thriftfinds #silk #downtownwestchesterpa #wcshoplocal #wcshopsmall #chestercountyshopping #thriftboutique #alternativestyle #thegspotthrift #thegspot #modernretrostyle #modernretroboutique #ladyinred (at G Spot Thrift Boutique) https://www.instagram.com/p/B38qIn1DrLP/?igshid=18rzbkolk2t8u
0 notes
woventrends · 5 years
Photo
Tumblr media
💧💧 Sauce dripping all day. Grab the look on @woventrends #linkinbio #saucedrip #minidresses.💧💧 . . . . . #novabae #boohoo #styledrop #saucedrippin #shopthelook #shopthestyle #shopthislook #shopthispost #fashionpost #salespost #positivevibes #positivity #positivethoughts #positiveaffirmations #keepinghope #curvygirlsrock #curvygirls #grabthelook #getthelookinstore #getthestyle #getthestylist https://www.instagram.com/p/B1Tdhv6hvvt/?igshid=6cri76kk1wxe
0 notes
raveroom · 2 years
Photo
Tumblr media
Madness…. #madguitarist #redroom #whitedoor #redmusic #shadowman #prints #styledrops #backdrop #jmzy (at Bartlett, Illinois) https://www.instagram.com/p/CYQzEOOuM1_/?utm_medium=tumblr
0 notes
Photo
Tumblr media
@Regranned from @styledrop - #MandyMoore in @rosie_assoulin #styledrop #fashion #goldenglobes #blog #goldenglobes2018 - #regrann
0 notes
styledrop · 11 months
Text
Tumblr media
14 notes · View notes
styledbyalykay · 7 years
Photo
Tumblr media
And one more! The Evie lace knit tee (pictured here in black) is another one of our new tops you can find at the link in my bio 👆. Shop these looks now before they're gone!! 🛍🏃 • • • #stelladotstyle #stelladot #sdjoy #styledrop #style #stylist #styleguide #styleblogger #styleinspiration #styleinspo #whattowear #clothing #newarrivals #brandnew #getthelook #dontmissout #limitedstock
0 notes
hollywoodtapfl · 7 years
Photo
Tumblr media
Credit to @styledrop : #views #blog #styledrop #beach #hollywoodtapfl #hollywoodfl #hollywoodflorida #hollywoodbeach #downtownhollywood #miami #fortlauderdale #ftlauderdale #aventura #dania #daniabeach #hallandale #hallandalebeach #davie #pembrokepines #miramar @hollywoodtapfl (at The Diplomat Beach Resort)
0 notes
polkadotpopp · 7 years
Photo
Tumblr media Tumblr media
Who: Princess Beatrice
What: Karl Lagerfeld Selena Dress
Where: Reception at 10 Downing Street for The London Evening Standard “Get London reading” campaign | 1st July 2014
Worn with Miss KG Kurt Geiger Alba Sandals in Black & Rolex Watch
Photo Credits: Nigel Howard/Evening Standard/hollandse-hoogte & Styledrops
6 notes · View notes
jayvoicetrg · 4 years
Text
Bill Cosby appeal: Camille Cosby breaks silence, says racism at root of husband's incarceration
Wednesday, June 24, 2020 5:46PM
In her first major interview in six years, Camille…
0 notes
thegspotboutique · 4 years
Photo
Tumblr media
SLINKY, SHINEY & SEXY in this Gorgeous Vintage Very Red Asian Silk Dress... Stunning! #slinky #shiney #sexy #asianclothes #asianfashion #vintagedresses #vintagesilkdress #fashionistas #styleblog #tastemakers #styledrop #vintageboutique #thriftyvintage #thriftfinds #silk #downtownwestchesterpa #wcshoplocal #wcshopsmall #chestercountyshopping #thriftboutique #alternativestyle #thegspotthrift #thegspot #modernretrostyle #modernretroboutique #ladyinred (at G Spot Thrift Boutique) https://www.instagram.com/p/B374nY5jF9u/?igshid=1kfayqjbm8uml
0 notes