• j4k3@lemmy.world
        11 days ago

        It has a lot of potential if the T5 can be made conversational. After diving into a custom DPM adaptive sampler, I found there is a lot more specificity required than people assume. I believe the vast majority of people are not using the model with the correct workflow; applying the old model workflows to SD3 produces garbage results. The two CLIP models and the T5 need separate prompts, and the negative prompt needs an inverted channel with a slight delay before reintegration. I also think the smaller quantized version of the T5 is likely the primary problem overall: any transformer text model that small, which is then quantized down to an extremely small size, is going to be problematic.
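        The "inverted channel with a slight delay" can be read as classifier-free guidance where the negative prediction is withheld for the first few steps and then reintegrated with its usual inverted sign. A minimal toy sketch of that idea, assuming nothing about the real SD3/ComfyUI internals (the stand-in denoisers, `GUIDANCE`, and `DELAY_STEPS` are all hypothetical names for illustration):

        ```python
        GUIDANCE = 7.0
        DELAY_STEPS = 3  # steps to wait before reintegrating the negative channel

        def guided_eps(eps_pos, eps_neg, step):
            """Combine positive/negative noise predictions for one sampler step."""
            if step < DELAY_STEPS:
                # Early steps: positive conditioning only; negative channel held back.
                return eps_pos
            # Standard CFG: negative prediction plus a scaled (pos - neg) delta,
            # i.e. the negative channel enters with an inverted sign.
            return eps_neg + GUIDANCE * (eps_pos - eps_neg)

        # Toy stand-ins for a real denoiser's per-prompt outputs (not SD3).
        def denoise_pos(x, step): return 0.9 * x
        def denoise_neg(x, step): return 1.1 * x

        x = 1.0
        for step in range(8):
            eps = guided_eps(denoise_pos(x, step), denoise_neg(x, step), step)
            x = x - 0.1 * eps  # toy Euler-style update
        ```

        The point of the delay is that the inverted negative term only starts steering the trajectory once the early structure has formed, rather than fighting the positive prompt from step zero.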

        The license is garbage. The company is toxic. But the tool is more complex than most of the community seems to understand. I can generate a woman lying on grass in many intentional and iterative ways.

        • brucethemoose@lemmy.world
          11 days ago

          Yeah, and it’s just fp8 truncation, right? Not actual “smart” quantization? That’s a big hit even for huge decoder-only LLMs.
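          The gap between the two is easy to see numerically. A rough pure-Python sketch, assuming a simplified e4m3-style rounding (4 significant bits, exponent overflow ignored) versus per-channel absmax int8, on toy Gaussian weights (all names hypothetical):

          ```python
          import math, random

          def fake_fp8_e4m3(x):
              # Simulate fp8 e4m3-style rounding: keep 4 significant bits
              # (1 implicit + 3 mantissa). Exponent range/overflow omitted.
              if x == 0.0:
                  return 0.0
              m, e = math.frexp(x)           # x = m * 2**e, 0.5 <= |m| < 1
              return round(m * 16) / 16 * 2 ** e

          def quant_dequant_int8(channel):
              # "Smart" per-channel absmax int8: one float scale per channel.
              scale = max(abs(v) for v in channel) / 127 or 1.0
              return [round(v / scale) * scale for v in channel]

          def mse(rows_a, rows_b):
              n = sum(len(r) for r in rows_a)
              return sum((a - b) ** 2 for ra, rb in zip(rows_a, rows_b)
                         for a, b in zip(ra, rb)) / n

          random.seed(0)
          weights = [[random.gauss(0, 1) for _ in range(256)] for _ in range(16)]

          fp8_rows = [[fake_fp8_e4m3(v) for v in row] for row in weights]
          int8_rows = [quant_dequant_int8(row) for row in weights]

          print(f"fp8-style truncation MSE: {mse(weights, fp8_rows):.2e}")
          print(f"per-channel int8 MSE:     {mse(weights, int8_rows):.2e}")
          ```

          Blind per-element rounding spends its bits uniformly in relative terms, while calibrated per-channel scaling adapts the step size to each channel's actual range, so its reconstruction error on typical weight distributions comes out noticeably lower.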