The TRACTOR program aims to automate the translation of legacy C code to Rust. The goal is to achieve the same quality and style that a skilled Rust developer would produce, thereby eliminating the entire class of memory safety security vulnerabilities present in C programs. This program may involve novel combinations of software analysis, such as static analysis and dynamic analysis, and machine learning techniques like large language models.

Highlights from the forum thread:

There’s even a conspiracy theory that the Rust Foundation’s 501 organization type was chosen so it can conduct lobbying. The implication is that the Rust Foundation is behind government recommendations to move toward memory-safe languages. (Big Borrow-Checker, if you will.)

In a worst-case scenario, this could be the worst thing to happen to Rust’s image. We end up with billions of lines of rewritten Rust code that is full of soundness and logic bugs, and that no one understands.

DARPA funds some projects on a “there is an infinitesimal chance of success, but if you succeed, it’s a big deal” basis. Silent Talk is an example here: it was very unlikely to succeed, even at the beginning, but if you could hold a radio conversation without sound, that’d be a huge deal for special operations forces.

    • simple@lemm.ee
      2 months ago

      Translating entire codebases with LLMs? What could POSSIBLY go wrong?

      I also don’t see how it would ever be possible to directly translate C to Rust. They’re so fundamentally different that things are bound to not work the same.

      • IllNess@infosec.pub
        2 months ago

        I don’t even understand how they are going to get around the memory safety they are doing this translation for. Watch them have to break the safety features of Rust just to make certain programs work.
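
To make the concern above concrete, here is a minimal sketch of the difference between a literal translation and an idiomatic one. Both function names (`sum_raw`, `sum_slice`) are hypothetical and not taken from TRACTOR or any existing translator. The first mirrors a C `int sum(const int *data, size_t len)` and has to use `unsafe`. The second is what a Rust developer would more likely write by hand.

```rust
/// Literal, C-style translation: a raw pointer plus a length.
/// The compiler cannot check the pointer, so the function is `unsafe`.
///
/// # Safety
/// `data` must point to `len` valid, initialized `i32` values.
pub unsafe fn sum_raw(data: *const i32, len: usize) -> i32 {
    let mut total = 0;
    for i in 0..len {
        // Unchecked pointer arithmetic, the same as `data[i]` in C.
        total += unsafe { *data.add(i) };
    }
    total
}

/// Idiomatic rewrite: the slice carries its own length, so the
/// function needs no `unsafe` and cannot read out of bounds.
pub fn sum_slice(data: &[i32]) -> i32 {
    data.iter().sum()
}
```

Calling `unsafe { sum_raw(v.as_ptr(), v.len()) }` and `sum_slice(&v)` on the same array gives the same result. Only the second lets the compiler enforce the memory safety guarantee, which is why a translation that stays in the first style would not deliver what the program is after.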

  • astronaut_sloth@mander.xyz
    2 months ago

    I think this is an interesting idea. If they’re able to pull it off, I think it will cement the usefulness of LLMs. I have my doubts, but it’s worth trying. I’d imagine that the LLM is specially tuned to be more adept at this task. Your bog-standard GPT-4 or Claude will probably be unreliable.

    • ByteOnBikes@slrpnk.net
      2 months ago

      Having built code converters that auto-migrate code to a later version of the same language, I’m incredibly worried. We still had to manually verify everything.

      I’m hopeful though that this does become the wave of the future. There’s some serious legacy shit out there that doesn’t have enough of a financial gain to revisit and rewrite.

  • solrize@lemmy.world
    2 months ago

    Maybe it would be easier to translate to Ada? That is, for C code that doesn’t make heavy use of malloc/free. The idea of Rust’s borrow checker, as I understand it, is to statically track the references to malloc’d memory to make sure that you never use-after-free or double-free. If your C code uses malloc in uncontrolled ways, then massaging it to satisfy a borrow checker sounds horribly difficult, and you should either give up or run it under a heavily managed environment like Valgrind. If (as is typical of embedded code) it just does stuff with some fixed memory buffers and doesn’t do much runtime allocation, then there isn’t anything for a borrow checker to look after, so you can use a safe language (Ada) that doesn’t have borrow checking.

    Disclaimer: I don’t use Rust at the moment. Someday. I do like Ada despite its verbosity, but it’s not that great at managing dynamic memory. It is starting to take on Rust influences to help with that.
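
The two cases in the comment above can be sketched in Rust as follows. This is an illustration only, and all function names are hypothetical. In the heap case, `Box` plays the role of a malloc'd allocation, and the commented-out lines are the ones the compiler would reject. In the fixed-buffer case there is no allocation to track at all.

```rust
/// Heap case: taking `Box<[u8]>` by value means this function owns the
/// allocation and frees it exactly once when `buf` goes out of scope.
pub fn consume(buf: Box<[u8]>) -> usize {
    buf.len()
}

pub fn heap_example() -> usize {
    // Roughly `malloc(16)` in C, but the allocation has a single owner.
    let buf: Box<[u8]> = vec![0u8; 16].into_boxed_slice();
    let n = consume(buf); // ownership moves into `consume`, which frees it

    // Both lines below fail to compile with error E0382 (value used
    // after move). In C they would be a use-after-free and a double free.
    // let m = buf.len();
    // drop(buf);
    n
}

/// Fixed-buffer case, typical of embedded code: a fixed-size array that
/// is only read through a shared reference, so nothing is ever freed.
pub fn checksum(buf: &[u8; 8]) -> u8 {
    buf.iter().fold(0u8, |acc, b| acc.wrapping_add(*b))
}
```

In the second function the borrow checker has very little to do, which matches the comment's point about code that only uses fixed buffers. Rust would still apply bounds checks to any indexing into the array.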