• Ocelot@lemmies.world
    English
    arrow-up
    104
    arrow-down
    2
    ·
    edit-2
    10 months ago

    Oh god, I felt this one. Devs too busy, incompetent, or just plain lazy to figure out why their code is so slow, so they just have ops throw more CPU and memory at it to brute-force performance. Then ops gets to try to explain to management why we are paying AWS $500k per month to support 50 concurrent users.

    • Vlyn@lemmy.zip
      English
      arrow-up
      41
      arrow-down
      2
      ·
      10 months ago

      The sad thing: Throwing hardware at a problem was actually cheaper for a long time. You could buy that $1500 CPU and put it in your dedicated server, or spend 40 developer hours at $100 a pop. Obviously I’m talking about after the easy software side optimizations have already been put in (no amount of hardware will save you if you use the wrong data structures).

      Nowadays you pay $500 a month for 4 measly CPU cores in Azure. Or “less than 1 core” for an SQL Server.

      Obviously you have a lot more scalability and reliability in the cloud. But for $500 a month each we had a 16 core, 512 GB RAM machine in the datacenter (4 of them). That kind of hardware on AWS or Azure would bankrupt most companies in a year.
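
The numbers quoted in this comment are easy to sanity-check. A rough sketch, using only the figures stated above (the constant names are made up for illustration, and the comparison ignores RAM, reliability, and staffing costs):

```python
# Figures quoted in the comment above; the names are illustrative only.
CPU_PRICE = 1500            # one-off hardware purchase, USD
DEV_HOURS = 40
DEV_RATE = 100              # USD per developer hour

AZURE_MONTHLY = 500         # USD per month for 4 cores
AZURE_CORES = 4
DATACENTER_MONTHLY = 500    # USD per month for one 16-core machine
DATACENTER_CORES = 16


def per_core(monthly_cost: float, cores: int) -> float:
    """Monthly cost of a single core."""
    return monthly_cost / cores


optimisation_cost = DEV_HOURS * DEV_RATE
azure_per_core = per_core(AZURE_MONTHLY, AZURE_CORES)
datacenter_per_core = per_core(DATACENTER_MONTHLY, DATACENTER_CORES)

print(f"developer time: ${optimisation_cost} vs. CPU: ${CPU_PRICE}")
print(f"Azure: ${azure_per_core:.2f} per core per month")
print(f"datacenter: ${datacenter_per_core:.2f} per core per month")
print(f"ratio: {azure_per_core / datacenter_per_core:.0f}x")
```

On these figures alone the cloud cores come out at four times the per-core price, before the 512 GB of RAM is even counted.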

    • Aceticon@lemmy.world
      arrow-up
      9
      ·
      edit-2
      10 months ago

      Well, having been on the other side, sometimes the dev is also trying to fight the good fight whilst having to use some crap 3rd party system/library that's imposed from above because somebody at the C-suite level, after being suitably wined and dined (and who knows what more, including implied or even explicit promises for the future of their career), signed a massive agreement with one of the big corporate software providers, so now those of us at the coalface have to justify the money spent on that contract by using every POS from said big corporate software provider.

      I mean, I might be exaggerating the overtly corrupt nature of the deal (in my experience it's more a mix of CTO incompetence, or the CTO being pretty much powerless at the C-suite level because his is not the core business, hence overridden, and high-level management trading favours using company money and more for personal rather than corporate reasons), but even competent devs that know their thing can't really do much when they have to use a bug-riddled POS massive framework from some vendor that doesn't even have proper support, for "corporate reasons".

      • phoenixz@lemmy.ca
        arrow-up
        8
        ·
        10 months ago

        I got somebody at the C-suite level fired after I presented evidence of him wining and dining with a shit supplier (actually being buddy-buddy and literally dining with him on a weekly basis), also for not knowing the consequences of his decisions, and also for him being unable to keep his hands off employees below him (me included).

        Within 3 months there were 5 severe complaints against him with the CEO and human resources.

        The company had whistleblower protections but obviously fired me for my troubles as well anyway.

        I don’t care, the fucker was evil and the company honestly too, and I’m happy I’m gone from there.

          • phoenixz@lemmy.ca
            arrow-up
            3
            ·
            10 months ago

            It sucked in the moment, but now I’m more than fine with it. I see the company for what it is now, quite evil and a detriment to society. I’m happy I’m gone from there.

  • 1984@lemmy.today
    arrow-up
    71
    arrow-down
    1
    ·
    edit-2
    10 months ago

    This happens all the time. Companies are bleeding money into the air every second to AWS, but they have enough money to not care much.

    AWS really was brilliant in how they built a cloud and how they marketed everything as “pay only for what you use”.

    • sunbeam60@lemmy.one
      arrow-up
      22
      ·
      10 months ago

      We worked with a business unit to predict how many people they would migrate onto their new system in weeks 1-2 … they controlled the migration through some complicated Salesforce code they had written.

      We were told “half a million first week”. We reserved capacity to be ready to handle the onslaught.

      8000 appeared week 1.

      • 1984@lemmy.today
        arrow-up
        20
        arrow-down
        1
        ·
        edit-2
        10 months ago

        I mean, I would put brilliant in quotes in the way that it’s brilliant for their profits. Not brilliant in the way of making the world a better place.

        • Oderus@lemmy.world
          arrow-up
          10
          ·
          10 months ago

          Companies hate CapEx and love OpEx. That’s the main driver, as companies loathe hardware life cycle costs and prefer a pay-as-you-go model. It is more expensive, but it’s more budget friendly as you avoid sticker shock every 3-4 years.

    • brennesel@feddit.de
      arrow-up
      2
      ·
      10 months ago

      Do you mean that it’s still the case that more resources are allocated than actually used, or that the code does not need to be optimized anymore due to elastic compute?

      • 1984@lemmy.today
        arrow-up
        10
        arrow-down
        1
        ·
        edit-2
        10 months ago

        I think both are consequences of the cloud.

        It’s cheaper for companies to just add more compute than to pay devs to optimize the code.

        And overpaying for server capacity they don’t use isn’t a big concern for them either.

        Both of these things lead to AWS making more money.

        It’s also really good for AWS that once these things are built, they just keep bringing in money on their own 24 hours per day.

        • brennesel@feddit.de
          arrow-up
          2
          ·
          10 months ago

          If I remember correctly, that was the original idea of AWS, to offer their free capacity to paying customers.

          Do you think that AWS in particular has this problem, or Azure and GCP as well? I have mainly worked with DWHs in Snowflake, where you can adjust the compute capacity within seconds. So you pay almost exactly for the capacity you really need.
          Not having to optimize queries is a good selling point for cloud-based databases, too.

          It is certainly still cheaper than self-hosted on-premises infrastructure.

  • huginn@feddit.it
    arrow-up
    42
    ·
    10 months ago

    Meanwhile I’m given a laptop with 16 GB of RAM to compile Gradle projects on.

    My swap file is regularly 10+ gigs. Pain.

    • lemmyvore@feddit.nl
      English
      arrow-up
      9
      ·
      10 months ago

      That reminded me of trying to compile a Rust application (Pika Backup) on a laptop with 4 GB of RAM (because AUR).

      That was a fun couple of attempts. Eventually I just gave up and installed a flatpak.

  • RandomVideos@programming.dev
    arrow-up
    42
    arrow-down
    8
    ·
    10 months ago

    Why did it change from 64 GB of RAM to 1.268869321 E+89 (64!) GB of RAM?

    Also, 2.092278988 E+13 (16!) GB is a lot more than 64 GB.
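
For anyone who wants to check the factorial joke, a quick sketch using only Python's standard library (`digit_count` is a helper invented for this example):

```python
import math


def digit_count(n: int) -> int:
    """Number of decimal digits in a positive integer."""
    return len(str(n))


# Read "16!" and "64!" as factorials rather than as excited RAM sizes.
for ram_gb in (16, 64):
    value = math.factorial(ram_gb)
    print(f"{ram_gb}! has {digit_count(value)} digits")
```

Python integers are arbitrary precision, so both values are computed exactly rather than as floating-point approximations.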

  • yukichigai@kbin.social
    arrow-up
    33
    arrow-down
    1
    ·
    10 months ago

    Bonus if the vendor refuses to provide any further support until your department signs off on the resource expansion.

    In a just world that’s when you drop the vendor. In a just world.

  • abbadon420@lemm.ee
    arrow-up
    38
    arrow-down
    7
    ·
    10 months ago

    64! is a whole lot more than 64 though. It’s a number with 90 digits.

  • marcos@lemmy.world
    arrow-up
    22
    arrow-down
    3
    ·
    10 months ago

    Yeah, almost certainly the software only uses 4GB because it limits itself to what memory it has available.

    I have seen this conversation pan out a few times already. It has always been because of that, and once expanded, things work much better. (Personally I have never taken part in one; I guess that’s luck.)

  • nieceandtows@programming.dev
    arrow-up
    19
    ·
    edit-2
    10 months ago

    Flip side of the coin: I had a sysadmin who wouldn’t increase the tmp size from 1 GB because ‘I don’t need more than that recommended size’. I deploy tons of ETL jobs, and they download GBs of files for processing to this globally known temp storage. I got it changed for one server successfully after much back and forth, but for the other one I just overrode it in my config files for every script.
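
The comment does not say how the override was done. One way a script can avoid a small system temp directory is to point its scratch files elsewhere; a minimal sketch, assuming Python and a hypothetical `scratch_root` path supplied by the script's own config:

```python
import tempfile
from pathlib import Path


def process_in_scratch(payload: bytes, scratch_root: Path) -> int:
    """Write payload to a scratch file under scratch_root and return its size.

    scratch_root is a hypothetical config value pointing at a volume with
    more room than the system temp directory.
    """
    scratch_root.mkdir(parents=True, exist_ok=True)
    # dir= makes tempfile create the working directory under scratch_root
    # instead of the system default (usually /tmp on Linux).
    with tempfile.TemporaryDirectory(dir=scratch_root) as workdir:
        target = Path(workdir) / "download.bin"
        target.write_bytes(payload)
        return target.stat().st_size
```

Setting the `TMPDIR` environment variable before the script starts has a similar effect for code that relies on `tempfile` defaults.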

    • stevecrox@kbin.social
      arrow-up
      11
      ·
      10 months ago

      This is why Java rocks with ETL, the language is built to access files via input/output streams.

      It means you don’t need to download a local copy of a file, you can drop it into a data lake (S3, HDFS, etc…) and pass around a URI reference.
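
The idea of processing a stream instead of a downloaded copy is not tied to Java. A minimal sketch in Python, where an in-memory `io.BytesIO` stands in for a remote object (a real S3 or HDFS client is assumed to expose a similar file-like `read()`; `stream_digest` and `CHUNK_SIZE` are names made up for this example):

```python
import hashlib
import io
from typing import BinaryIO

CHUNK_SIZE = 64 * 1024  # arbitrary chunk size chosen for the example


def stream_digest(source: BinaryIO, chunk_size: int = CHUNK_SIZE) -> tuple[int, str]:
    """Consume a binary stream chunk by chunk; return (bytes read, SHA-256 hex)."""
    digest = hashlib.sha256()
    total = 0
    # Only one chunk is held in memory at a time, and nothing is written to disk.
    while chunk := source.read(chunk_size):
        digest.update(chunk)
        total += len(chunk)
    return total, digest.hexdigest()


# In-memory stand-in for a large remote file.
data = b"row\n" * 100_000
size, checksum = stream_digest(io.BytesIO(data))
print(size, checksum)
```

Memory use stays bounded by the chunk size regardless of how large the source is, which is the property the comment is praising.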

      Considering the size of Large Language Models, I really am surprised at how poorly streaming is handled within Python.

      • nieceandtows@programming.dev
        arrow-up
        8
        ·
        10 months ago

        Yeah, Python does lack in such things. Half a decade ago, I set up an ML model for Tableau using Python, and things were fine until one day it just wouldn’t finish anymore. Turns out the model got bigger and Python filled up the RAM and the swap trying to load the whole model in memory.

        • stevecrox@kbin.social
          arrow-up
          4
          ·
          10 months ago

          During the pandemic I had some unoccupied Python graduates I wanted to teach data engineering to.

          Initially I had them implement REST wrappers around Apache OpenNLP and spaCy and then compare the results on random data sets (Project Gutenberg, SharePoint, etc…).

          I ended up stealing a grad data scientist because we couldn’t find a difference (while there was a difference in confidence, the actual matches were identical).

          spaCy required 1 vCPU and 12 GiB of RAM to produce the same result as OpenNLP, which was running on 0.5 vCPU and 4.5 GiB of RAM.

          2 grads were assigned a Spring Boot/Camel/OpenNLP stack and 2 a spaCy/Flask application. It took both groups 4 weeks to get a working result.

          The team slowly acquired lockdown staff, so I introduced MinIO/RabbitMQ/NiFi/Hadoop/Express/React and then different file types (not raw UTF-8, but what about doc, pdf, etc…) for NLP pipelines. They built a fairly complex NLP processing system with a data exploration UI.

          I figured I had a group to help me figure out the best Python approach in the space, but Python’s limitations just led to stuff like needing a Kubernetes volume to host data.

          Conversely none of the data scientists we acquired were willing to code in anything but Python.

          I tried arguing at my company of the time that there was a huge unsolved bit of market there (e.g. MLOps).

          Alas unless you can show profit on the first customer no business would invest. Which is why I am trying to start a business.

  • ericbomb@lemmy.world
    arrow-up
    9
    ·
    10 months ago

    narrows eyes

    Look, I don’t “think” that was me these last few weeks. I’m pretty sure my support engineer butt was smart enough to check resources before blaming RAM…

    But it totally could have been me, and in that case I blame dev.

  • mvirts@lemmy.world
    arrow-up
    4
    ·
    10 months ago

    I loathe memory-reservation-based scheduling. It’s always a lie, always. Looking at you, Hadoop.