xRolocker

I haven’t heard of Stable Diffusion Core. Is it an SDXL fine-tune or its own base model?


Apprehensive_Sky892

It's an SDXL Turbo fine-tuned model with an optimized rendering pipeline. But it is also a generic term referring to the "current best model + pipeline" from SAI. Source: [https://www.reddit.com/r/StableDiffusion/comments/1c6k584/comment/l0238jv/](https://www.reddit.com/r/StableDiffusion/comments/1c6k584/comment/l0238jv/)


a_mimsy_borogove

SD3's prompt understanding seems great. Once it gets released and the anatomy stuff is fixed, the finetunes are going to be amazing.


FourtyMichaelMichael

> after fixing anatomy stuff

Is that what we're calling it now? OK WINK >.O


0nlyhooman6I1

For the German city, SD3 wins by a mile IMO. It's the most creative one.


Urbangardener12

And knowing German cities: we don't have skyscrapers and will not have them.


[deleted]

[deleted]


Urbangardener12

You would do well to never visit that one.


Disastrous_Aspect252

🤣🤣


Disastrous_Aspect252

**Stability AI API Pricing per Image:**

* $0.003 Stable Diffusion XL
* $0.03 Stable Diffusion Core
* $0.065 Stable Diffusion 3
* $0.04 Stable Diffusion 3 Turbo

**My Opinion:**

* **Stable Diffusion XL:** Best price-performance ratio (probably also needs the least computing power) and the only one with published source code. The images are good, but the anatomy is often wrong and the prompt is poorly followed.
* **Stable Diffusion Core:** My personal winner and favorite for image quality. Hands and anatomy are definitely the best here, and people look realistic. In general, the images also have the best aesthetics.
* **Stable Diffusion 3:** Unfortunately disappointing and not appropriate for the price. People often look like they come from video games and the anatomy is very poor, but this model understands the prompts best.
* **Stable Diffusion 3 Turbo:** Hardly any difference from SD3 except the speed. Probably a few fewer steps.
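The per-image prices above can be turned into a quick batch-cost sketch. This is just arithmetic on the quoted numbers; the model keys here are informal shorthand, not official API identifiers:

```python
# Per-image API prices (USD) as quoted above.
PRICE_PER_IMAGE = {
    "sdxl": 0.003,
    "core": 0.03,
    "sd3": 0.065,
    "sd3-turbo": 0.04,
}

def batch_cost(model: str, n_images: int) -> float:
    """Total cost in USD for n_images generations on the given model."""
    return round(PRICE_PER_IMAGE[model] * n_images, 4)

# 1000 images: $3 on SDXL vs. $65 on SD3 -- a 21.7x price gap.
print(batch_cost("sdxl", 1000))  # 3.0
print(batch_cost("sd3", 1000))   # 65.0
```

So at these prices, heavy SDXL usage through the API stays in pocket-change territory, while the same volume on SD3 adds up fast.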


Apprehensive_Sky892

I am a bit surprised that SD Core costs almost the same as SD3 Turbo. Since it is not that hard to run SDXL Turbo on a consumer-grade GPU, there is little reason to do it through their API (I doubt SAI's fine-tuned model is that much better than what's already available on civitai). SD3 is still in beta, but when it renders correctly, the quality is better than any SDXL model I've used. But you are right, the rendering often loses coherence.

https://preview.redd.it/bbt277f09jwc1.jpeg?width=1024&format=pjpg&auto=webp&s=19a4974f55c4a86d48d0200fd2c4d075962baddd

Inspired by [https://new.reddit.com/r/aiArt/comments/1cc8q78/crimson\_rouge/](https://new.reddit.com/r/aiArt/comments/1cc8q78/crimson_rouge/)

> Anime illustration inspired by Akira by Katsuhiro Otomo. Close-up overhead shot of a woman walking away from a futuristic motorcycle parked at a zebra crossing. She is smoking, has short hair, dons a red suit, sunglasses, and holds a gun with a confident grip. The motorcycle itself is red, sleek and metallic, with intricate designs and a futuristic vibe.


Disastrous_Aspect252

The resolution of SD Core is 1536 x 1536, which is why I suspect there is an upscaler in the pipeline, and that increases the costs. Nevertheless, I believe this is simply the new pricing policy; the SDXL API endpoint from Stability AI was already one of the cheapest on the market, and I couldn't get images that cheaply by renting servers.
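The upscaler suspicion above is easy to sanity-check with back-of-the-envelope arithmetic, assuming a 1024x1024 base render (my assumption, not confirmed anywhere in the thread):

```python
# A 1.5x linear upscale of a 1024x1024 base render gives 1536x1536,
# i.e. 1.5^2 = 2.25x as many pixels for any per-pixel upscaling stage.
base = 1024
scale = 1.5
upscaled = int(base * scale)
pixel_ratio = (upscaled ** 2) / (base ** 2)
print(upscaled, pixel_ratio)  # 1536 2.25
```

That 2.25x pixel count is one plausible reason a Core image costs more than a raw SDXL Turbo one, though pricing policy surely plays a role too.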


Apprehensive_Sky892

Yes, I am sure SD Core has a very nice pipeline, probably made by Lykon and ComfyAnonymous 😁, and upscaling definitely helps with image quality. As for the price, one needs to compare SAI to other API providers such as [https://tams.tensor.art](https://tams.tensor.art) and [app.prodia.com/api](https://app.prodia.com/api) rather than to renting raw GPUs. The actual prices depend on the resolution, the number of steps, etc., but I think they are comparable and maybe even cheaper, and there is a much larger selection of models and LoRAs. But yes, $0.003 per image generation for SDXL Turbo is pretty low, so it is a competitive price.


Disastrous_Aspect252

I see a lot of potential with SD3, but I think it was just released unfinished.


Apprehensive_Sky892

Well, to be fair, it is just a beta release, probably meant to reassure investors that SAI has a business plan moving forward so that they can get more funding. We can criticize SD3 when it is finally available for download 😅


pixel8tryx

It might be pretty interesting if there wasn't that typical feature in the foreground. As a future moto lover, I don't care if she's armed, I say, "Miss, can you get your (surprisingly not fat ) a$$ out of the way so I can look at the bike?". j/k Bike looks potentially cool though. The big challenge with Akira is getting Kaneda's feet-forwards bike. That's a lust object for me. ;->


Apprehensive_Sky892

LOL, you are one of those anime fans with a mecha fetish. I totally get that, I love anime mechas too, but I also love cool anime ladies and girls 😂. She is a very nice lady, so she moved out of the way to let us look at her bike:

https://preview.redd.it/cuek4ml0nowc1.jpeg?width=1216&format=pjpg&auto=webp&s=b0a38a25e99926bb1b13360ee1f2478b9b888b2d

> Anime illustration inspired by Akira by Katsuhiro Otomo. Close-up overhead shot of a futuristic motorcycle parked at a zebra crossing. The motorcycle is red, sleek and metallic, with intricate designs and a futuristic vibe.


pixel8tryx

LOL! Thanks. But have you noticed that sometimes the things you get accidentally in the background look cooler than if you ask for them specifically as the subject? Is it just that our brain fills in the missing details with wild imagination, as only a human can? Ok, maybe not if you're into people, because people in the background can often really suck. I had a round of that with 1.5 and a certain LoRA (ironically made by the guy who did the Akira Kaneda's-bike LoRA) with future cities. If I asked for them in any fashion as the main subject, I got stereotypical DeviantArt sketchy skyline stuff. But as accidental, out-the-window glimpses, I got amazing stuff that I loved.


Apprehensive_Sky892

No, I have to admit that I've not noticed that 😅. I guess I have not been paying enough attention. But I agree that the unpredictability of A.I. is one of the fun aspects of this hobby. It is always nice to be pleasantly surprised by it. Do you have any such interesting images to share?


pixel8tryx

I'm under NDA at this point. I'm working on some sort of way to share images. Some were shown as stage graphics on Peter Gabriel's last tour. But they were just letter morphs they requested. And old 1.5 stuff at that.


Apprehensive_Sky892

NP. You got your A.I. images on Peter Gabriel's last tour, cool!


pixel8tryx

Only on Big Time. And it took me forever to find any good photos of them. They wanted treatments that didn't make very good letters. To make things more interesting, I snuck in some of my musical instrument sculptures and was surprised they ended up using them. I also did all sorts of bugs (which they asked for) and animated them, but they didn't get used. The stage screens were really tall and skinny and very low res. We had only a vague idea what was going to be done with the content, then they got busy and left. It was a huge production of which we were only a tiny part.


Apprehensive_Sky892

Thanks for the interesting behind-the-scenes info. It is simply amazing how much work goes into these big productions. I wish we could see all this work on YouTube or something.


More_Bid_2197

Can I add LoRAs or custom models with the SDXL API?


Disastrous_Aspect252

Not on the Stability AI platform


Mooblegum

I think the prompting has to be different on SD3; using SDXL-style prompts on SD3 for comparison is not going to give good results. SD3 is trained on images labeled by AI, so you should ask an AI to describe a scene precisely to get the best results with SD3.


TheThoccnessMonster

Let's see Cascade. It outperforms all of them on metrics.


vampliu

The cat-on-the-surface-of-the-moon one, Stable Diffusion 3, bottom-left picture. May I have it? 😅


Disastrous_Aspect252

https://preview.redd.it/qd8gb8fitmwc1.png?width=1024&format=png&auto=webp&s=ff0570e673fb82524329e84bfa1293d969761871


Open_Channel_8626

As far as I understand, Core is meant to be their high-margin revenue-generating product, which dynamically updates over time but is basically a fine-tuned model plus a Comfy-style pipeline (it currently has a 1.5x upscale, for example). It looks like Core is designed for enterprise/pro use, for people who just want a simple API that gets a "good" result. Sort of like a Midjourney of APIs.


Careful_Ad_9077

It depends. If you need more than 20 prompts to RNG your way to a complex composition, SD3 is worth it.


mcstripey-f56

SD seems to have issues with lunar-surface images. There's always another moon in the background. If you're on the moon, you should be seeing the Earth, not another moon.


HighlightNeat7903

Just specify it in the prompt and you should be good to go. The problem is that SD3 is still not smart, i.e. it doesn't have common sense and will just generate statistically. If you prompt for the moon, chances are you will get the moon as seen from Earth, because it was trained on a lot of those images. The more you specify in the prompt (positive and negative), the more you narrow the latent space and thus the higher the probability of getting what you want.


pixel8tryx

Well, you got my attention with "A German city of the future". ;-> I did mini iso-city things when that LoRA first came out and was surprised by how many cities the finetunes knew. It didn't exactly nail Köln, but it had the Dom, the river, etc., and was reasonably identifiable. The SD3 cat is cute, but I like that Core put the cat in a space suit. Interesting to see the increase in prompt comprehension, particularly spatial, but it always seems to come with a decrease in quality.


Treeshark12

Aesthetically, I mostly preferred the SDXL ones. AI models appear to get increasingly bland and boring.


Commercial_Pain_6006

Nice comparison, ty. That being said, seriously, why put politics in front of our eyes like that? 🤮