
This kind of syntactic sugar used to appeal to me, but now I think it's a pretty weird feature to add to a language. Using zip / enumerate primitives feels a lot more flexible.


Depends on what you mean by flexible. If you want to use them outside of loops then they could cause magic data copies under the hood. Zig really hates hidden control flow/allocations/copies. Within the for syntax it's pretty straightforward what gets assigned to what variables and how copies can be avoided.

Doing things like `a = @zip(some_list, some_other_list)` can be reasoned about in multiple ways, some of which involve silently calling malloc. It's particularly unclear what could be done with `a` afterwards. Zig hates that kind of ambiguity and is happy to err away from flexibility at times.


Rust also hates hidden allocations, and its iterator system can do all of this without them

Although- thinking about it, that may rely on the borrow checker (move semantics specifically)


That's nothing special though, `zip` just takes an item from each iterator, packs them into a tuple, and yields that. It has no weird bounds or requirements or anything: https://doc.rust-lang.org/std/iter/fn.zip.html

The impl of the default `next` is:

    fn next(&mut self) -> Option<(A::Item, B::Item)> {
        let x = self.a.next()?;
        let y = self.b.next()?;
        Some((x, y))
    }
So completely straightforward.


I was thinking about the fact that whatever you're iterating over has to be copied around throughout the process. Rust can guarantee that eg. deep-copies (clones) of allocated structs will never happen implicitly, if your iterator owns the values being iterated. But in languages where copying can trigger allocations, this could be a problem

I don't actually know whether that applies to Zig though


You're assuming that an iterator gets created and returned by zip. In zig the for loop syntax doesn't need the concept of an iterator. All you need is variables in a capture.

That's what I mean by straightforward. You'd have to argue about the memory layout for an iterator, that you call functions on it etc. Lots of small decisions. In zig those things would usually be part of the standard library, not language features.

If you want to call a next() function on a thing in zig, you want to be explicit. That's what the language means by "no hidden control flow". You'd use a while loop if you wanted to iterate on a hashmap for example.
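For comparison, a Rust `for` loop does call `next()` implicitly. A rough sketch of what that looks like when written out by hand (the two function names are made up for illustration, and this is only an approximation of the real desugaring):

```rust
/// Sums a slice with an ordinary `for` loop; `next()` is called implicitly.
fn sum_with_for(values: &[i32]) -> i32 {
    let mut total = 0;
    for v in values {
        total += *v;
    }
    total
}

/// Roughly what the loop above expands to: the iterator is created
/// explicitly and `next()` is called by hand until it returns `None`.
fn sum_with_explicit_next(values: &[i32]) -> i32 {
    let mut total = 0;
    let mut iter = values.iter();
    while let Some(v) = iter.next() {
        total += *v;
    }
    total
}

fn main() {
    let data = [1, 2, 3, 4];
    println!("{} {}", sum_with_for(&data), sum_with_explicit_next(&data));
}
```

The second form is closer to what Zig asks you to write when a `next()` call is involved.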

In some sense, this is less elegant, but it's usually more obvious what the compiler would end up doing in any bit of code.


Can that zip more than two iterators? And does it perform a bounds check on each call to `a.next()` and `b.next()`?


It stops when one of the two iterators ends

It can't zip more than two per se, but you could zip the result of the first zip into a third and get ((item1, item2), item3). You could then map these if you wanted, to flatten them into a single tuple .map(|((item1, item2), item3)| (item1, item2, item3))
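A minimal sketch of that nesting, assuming three slices of different element types (`zip3` is a name invented for this example, not a standard library function):

```rust
/// Zips three slices by nesting two `zip` calls, then flattens the nested
/// `((a, b), c)` tuples into `(a, b, c)` with `map`.
fn zip3(a: &[i32], b: &[char], c: &[bool]) -> Vec<(i32, char, bool)> {
    a.iter()
        .zip(b.iter())
        .zip(c.iter())
        .map(|((x, y), z)| (*x, *y, *z))
        .collect()
}

fn main() {
    // The third slice is the shortest, so iteration stops after two items.
    let rows = zip3(&[1, 2, 3], &['a', 'b', 'c'], &[true, false]);
    println!("{:?}", rows);
}
```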

Of course there's a trade-off here between ergonomics and generality


> It can't zip more than two per se, but you could zip the result of the first zip into a third and get ((item1, item2), item3). You could then map these if you wanted, to flatten them into a single tuple .map(|((item1, item2), item3)| (item1, item2, item3))

FWIW that's more or less what `itertools::izip!` does for you, it just chains `zip`s then "splats" them using a `map`.


> It stops when one of the two iterators ends

Right; my question is, suppose you're iterating over two slice iterators - won't each call to `a.next()` and `b.next()` have to check whether that sub-iterator is done? One of the benefits of the Zig approach is that you can iterate over an arbitrary number of slices and do one check before entering the loop, followed by the compiler emitting unchecked index access in the loop. So it basically compiles down to the equivalent of a C `for` loop.


Rust's zip has a specialisation for iterators with a trusted length. Such as slice iterators.

`zip` yields exactly the same assembly as a loop over the index range with an unsafe item access: https://godbolt.org/z/7ebfxbhxc
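The linked snippet isn't reproduced here. As a sketch of the kind of pair being compared, assuming a dot product as the workload (both function names are invented for this example):

```rust
/// Dot product using `zip`; no explicit indexing, stops at the shorter slice.
fn dot_zip(a: &[i64], b: &[i64]) -> i64 {
    a.iter().zip(b.iter()).map(|(x, y)| x * y).sum()
}

/// Dot product over an index range with unchecked element access.
fn dot_indexed(a: &[i64], b: &[i64]) -> i64 {
    let n = a.len().min(b.len());
    let mut total = 0;
    for i in 0..n {
        // SAFETY: `i < n`, and `n` is no larger than either slice's length.
        total += unsafe { *a.get_unchecked(i) * *b.get_unchecked(i) };
    }
    total
}

fn main() {
    let a = [1, 2, 3];
    let b = [4, 5, 6];
    println!("{} {}", dot_zip(&a, &b), dot_indexed(&a, &b));
}
```

Both return the same values; whether they compile to the same assembly depends on the compiler version and optimization level, so check in Compiler Explorer rather than taking it on faith.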


For Zip it's TrustedRandomAccess[0] instead of TrustedLen. Imo the most radioactively unsafe trait in the standard library and will likely never be stabilized in its current form.

[0] https://github.com/rust-lang/rust/blob/f540a25745e03cfe9eac7...


That's cool. At the same time though, it almost feels like a distinction without a difference in some ways - Zig has a special built-in syntax; Rust doesn't use special syntax, but it does use complex special-cased unsafe code in the stdlib in order to implement a safe + performant API.


Rust's is built on top of (and exposed to) Iterators, which are a very general concept that can be rooted in all kinds of data structures, composed in all kinds of ways, and collected/processed in all kinds of ways (i.e. the user's code might not even contain an actual loop). The code continues to work in many situations, even where the optimization doesn't apply
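A small sketch of that generality, where `zip` is fed a slice iterator, a `chars()` iterator and an open-ended range, with no explicit loop in the user's code (the `label` function is hypothetical):

```rust
/// Pairs each word with a letter and a 1-based position. `chars()` and the
/// open-ended range are not slice iterators, yet `zip` composes with them
/// the same way it does with slices.
fn label(words: &[&str]) -> Vec<String> {
    words
        .iter()
        .zip("abcdefghijklmnopqrstuvwxyz".chars())
        .zip(1..)
        .map(|((word, letter), n)| format!("{}{}: {}", n, letter, word))
        .collect()
}

fn main() {
    for line in label(&["zig", "rust"]) {
        println!("{}", line);
    }
}
```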

You trade some special-case syntax and ergonomics for that generality, but it is very general even if not all of it is optimized in the same way


On the other hand, the "special cased unsafe code" is applicable to more than just zip, more than just the one array type, and is available in userland (though currently unstable so nightly only, both to implement it on a bespoke type and to rely on it).


You have to pass -O though, the point of Zig's for loop syntax is to get fast compile times and good performance also in debug mode :^)


Interesting, are "trusted-length" iterators something that might ever make it into userspace? Maybe as const generics?


It’s already in userspace, though nightly (and unsafe, obviously), so whether it’ll be stabilised, and in what form, is an open question: https://doc.rust-lang.org/std/iter/trait.TrustedLen.html


You can zip a zipper to combine them, but you end up with an item like (a, (b, (c, d))), not a huge deal since it gets destructured automatically.

The itertools crate has a multizip function/macro that allows zipping as many iterators as you'd like without nesting zips inside of zips!

I really recommend checking out itertools if you use rust and like iterators! It's a direct dependency of rustc itself too, which suggests it's a trusted project (it may even be maintained by rust devs, not totally sure).


>Rust also hates hidden allocations

Does it? Rust seems happy to allocate silently all the time.

    let x = String::from("hi");
    let y = vec![];

Does either of these allocate? As the writer or reader of this code, how do I know if either of these statements results in a heap allocation, or if the data is strictly on the stack?

Zig’s requirement of explicitly passing around an Allocator type removes any ambiguity completely.


Sure, any arbitrary function (or macro) logic can allocate. It's more a philosophy, not something that's language-enforced[0] in Rust- if you're creating a mutable, variable-size data structure like a String or a Vec or a HashMap you're not going to be very surprised that it allocates at some point (though technically zero-length Vecs don't allocate on construction, they wait until an item is added)

But closures don't require allocation, iterators don't require allocation, async doesn't require allocation. Copy semantics also don't allow allocation- implicit copies can only happen for data structures that are bitwise-copyable, which is enforced by the compiler. For copy-with-allocation you have to implement the Clone trait, and then invoke it explicitly with the .clone() method
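To make the Copy/Clone split concrete, a minimal sketch with two made-up types, `Point` and `Name`:

```rust
/// Plain data: bitwise-copyable, so assignment copies it implicitly.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

/// Owns a heap buffer, so it cannot be `Copy`; duplicating it needs `.clone()`.
#[derive(Clone, Debug, PartialEq)]
struct Name {
    text: String,
}

fn main() {
    let p = Point { x: 1, y: 2 };
    let q = p; // implicit bitwise copy; `p` stays usable
    println!("{:?} {:?}", p, q);

    let a = Name { text: String::from("zig") };
    let b = a.clone(); // explicit deep copy, allocates a new buffer
    let c = a; // move, not a copy: `a` cannot be used after this line
    println!("{:?} {:?}", b, c);
}
```

Adding `Copy` to the derive list on `Name` is rejected by the compiler, because `String` is not `Copy`.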

But the original context was a question of philosophy, so I was only speaking to Rust's overall philosophy

[0] Technically I think if you're using no_std you won't have access to any standard constructs that allocate (which obviously will prevent their use at compile-time), though I believe you're still allowed to eg. call out to foreign functions manually that would allocate. And of course, this still isn't as granular as Zig's allocation-control.


> Technically I think if you're using no_std you won't have access to any standard constructs that allocate

Indeed, because you only have core, and not alloc, which is where the allocator lives.

Mostly, core looks very similar to the functionality you can get of the same types in std - except that of course types which allocate are missing (Vec, HashMap, Box, String, etc.) and all the types which reflect operating system services are likewise missing (UdpSocket, File, Mutex, etc.)

However there are deviations. For example, Rust's slices in core don't have a sort() method. Why not? Rust provides a stable sort which uses an allocator because that's much faster, and it doesn't bother providing a not-so-good stable sort that can work without one. If you can't afford an allocator but you want a stable sort, that's a rather niche case, and Rust won't solve it. Rust's sort_unstable() doesn't need an allocator, so it is provided on slices even with only the core library.
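A short sketch of the visible difference between the two, using `(key, tag)` pairs so that ties are observable (the wrapper function names are invented; this runs against std, not a no_std build):

```rust
/// Stable sort by key: pairs with equal keys keep their original order.
/// This is the variant that needs an allocator.
fn sort_stable(pairs: &mut [(u8, char)]) {
    pairs.sort_by_key(|&(key, _)| key);
}

/// Unstable sort by key: works in place without allocating, but the
/// relative order of pairs with equal keys is unspecified.
fn sort_in_place(pairs: &mut [(u8, char)]) {
    pairs.sort_unstable_by_key(|&(key, _)| key);
}

fn main() {
    let mut a = [(2, 'a'), (1, 'b'), (2, 'c'), (1, 'd')];
    sort_stable(&mut a);
    println!("{:?}", a);

    let mut b = [(2, 'a'), (1, 'b'), (2, 'c'), (1, 'd')];
    sort_in_place(&mut b);
    println!("{:?}", b);
}
```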

Rust For Linux, the Linux kernel's Rust, has core, plus its own somewhat customised twist on alloc, plus an entirely custom kernel library. In Rust For Linux, the allocators are all explicit because Linus hates implied allocation. So for example in Rust For Linux a Vec doesn't have a push() method, because push() on a Vec can cause the Vec to grow, which is an allocation - instead Rust For Linux provides Vec::try_push() which fails if you'd need to grow the Vec before it could successfully push.
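Standard Rust can get a similar fallible shape with `Vec::try_reserve`. A sketch, where `try_push` is a hypothetical free function written for this example and not the Rust For Linux API:

```rust
use std::collections::TryReserveError;

/// Hypothetical helper: pushes `value` only if room for it can be reserved,
/// reporting allocation failure as an error instead of aborting.
fn try_push<T>(vec: &mut Vec<T>, value: T) -> Result<(), TryReserveError> {
    vec.try_reserve(1)?; // may allocate; returns Err if the allocator fails
    vec.push(value); // capacity is already there, so this does not grow
    Ok(())
}

fn main() {
    let mut bytes: Vec<u8> = Vec::new();
    match try_push(&mut bytes, 42) {
        Ok(()) => println!("pushed, len = {}", bytes.len()),
        Err(err) => println!("allocation failed: {}", err),
    }
}
```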


The Rust docs are very clear that initializing an empty vector (or hashmap) does not cause an allocation and is thus equivalent to using `Vec::with_capacity(0)`. The same applies to strings because strings are vectors of bytes with additional methods. You can also use the SmallVec library to store the first n items in an array on the stack and spill onto the heap once that length is exceeded.

Whether a clearly visible `vec.push(x)` call is silent to you depends on your viewpoint. I'd say it's recognizable as a potential allocation, as it has to be when pushing to a dynamic structure of arbitrary size, but I also understand how this can be seen differently. This is, I think, where Rust and Zig differ. Rust makes most expensive operations explicit but still allows things like this or macros; Zig tries to avoid even that.


> Zig’s requirement of explicitly passing around an Allocator type removes any ambiguity completely.

Not really...

    var list = std.ArrayList(u21).init(std.testing.allocator);
    try list.append('a');
Does creating the ArrayList allocate memory? Does appending to the array list reallocate? The only thing you know is that ArrayList will use the allocator at some point; you can't tell when. And this gets worse once the abstraction level increases: if you have your own type which accepts an allocator and has 100 methods, it's impossible to tell which methods allocate.

In both languages you still need to know the specifics of the data structures you are working with, the only real problem that Zig solves is to actually allow you to use custom allocators with the std collections, which is something that is still unstable in Rust.


You can tell when, because if it uses the allocator it will return an error. So the first line definitely doesn't allocate, and the second definitely does.

That is, unless you explicitly handle OOM conditions inside your construct, e.g. 'crash if you're OOM', which isn't typical in zig code. All code I interact with will return an allocator error if allocation fails.


For what it's worth (I can't tell if you knew) vec![] doesn't allocate. Actually in this context it likely won't even compile, although it depends.

vec![] says I want a Vec with nothing in it. That doesn't require an allocation. In effect you are asking for three things, a pointer, of the correct type but not actually pointing at anything, and two integers (length and capacity) which are both zero. These three things don't live on the heap, they're a local variable on your stack.

Rust's first question here is: A vec of what? If we wrote something in those square brackets, it could guess the type of that, but we didn't. It may be able to infer the type from what is pushed into this Vec named y later, except we didn't make it mutable so we can't push anything into it - or if it's returned, from the return type (Rust does not allow function parameter or return types to be inferred). If Rust isn't able to infer the type, that's an error during compilation.

  let mut y: Vec<()> = vec![];
That says I want a mutable but initially empty Vec of the empty tuple. Rust is fine with that, and the result is a Vec which can hold up to usize::MAX of the empty tuple. It won't allocate, because the empty tuple doesn't take up any actual space - it's empty. Since empty tuples are indistinguishable, in some sense Rust is really destroying them when you put them in the Vec and then making fresh ones to order when you remove them, and there's just no way to tell.

  let mut y: Vec<u8> = vec![];
Now we're making a mutable, initially empty, Vec of bytes. This still doesn't allocate... yet. However if we try to push a byte into it, or we ask to ensure there is space for one or more bytes in it, that will allocate.

  let y: Vec<u8> = Vec::with_capacity(1);

This allocates immediately. Even though we promised we won't actually mutate this Vec, we insisted it be created with capacity for at least one byte, which will mean an allocation. Rust may end up allocating for a modest capacity larger than one byte, since it's likely the underlying system works in larger "chunks".
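The cases above can be checked by looking at `capacity()`. A small sketch (the helper `capacity_after_pushes` is made up for this example, and the exact capacity after a push is an implementation detail):

```rust
/// Returns the capacity of a byte Vec after pushing `n` bytes into it.
fn capacity_after_pushes(n: usize) -> usize {
    let mut bytes: Vec<u8> = vec![]; // no heap buffer yet
    for i in 0..n {
        bytes.push(i as u8); // the first push is where allocation happens
    }
    bytes.capacity()
}

fn main() {
    // Empty: capacity 0, nothing allocated.
    println!("empty:    {}", capacity_after_pushes(0));
    // After one push: some nonzero capacity chosen by the implementation.
    println!("one push: {}", capacity_after_pushes(1));
    // Requested up front: allocated immediately, at least 1.
    let reserved: Vec<u8> = Vec::with_capacity(1);
    println!("reserved: {}", reserved.capacity());
    // Zero-sized elements: pushing never needs a buffer.
    let mut units: Vec<()> = vec![];
    units.push(());
    println!("unit len: {}", units.len());
}
```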


(You’re forgetting a “new” in the string example)


Thanks, I fixed it :)


In general Zig foregoes syntactic sugar and requires implementing higher-level APIs by composing primitives. But a new language feature is a candidate when it solves a use case that can't otherwise be solved, or opens up a path to more efficient code.

Loris' blog post points out that the new for loops address the latter:

> In the multi-sequence for loop version it’s only necessary to test once at the beginning of the loop that the two arrays have equal size, instead of having 2 assertions run every loop iteration. The multi-sequence for loop syntax helps convey intention more clearly to the compiler, which in turn lets it generate more efficient code.

It also builds on existing properties of slices/arrays, rather than adding a new "enumerate primitive".


This is my take as well. The older and more travelled I get the more I disdain these kinds of things. Your language syntax should do whatever the "thing" is that your language model is all about. Syntactic sugar should be for the things you do LOTS.

I watch language after language add sugar to maintain the appeal of their product, one niche group or application at a time. It turns into a death by a thousand cuts, or by a thousand sugar cubes. Most languages start out simple, appealing and understandable; an increasingly short amount of time later, they've layered on "helper" after "helper" to the point that it takes a bit of expertise to consume the language effectively.

I dream of a world where we'd measure languages by the complexity of their ASTs rather than their popularity on a TIOBE or StackOverflow index.


Arguably this change to the zig language is overall a simplification because the loop index capture is no longer a special case.


Could be. I think you're more the expert here than me? :D

To me, the following is a bit of syntactic sugar that I think is the kind of transcendental "go big/basic with it" that I hint at.

Some time ago, I worked in a language that had this idea that any composable block of code could be captured as 0-N statements between the characters [ and ]. They thought they were being clever and called it a "block of code". Which I thought was cool, because it looked like a block. Pedants called it a BlockClosure. If you wanted to pass parameters to one of these, they used a colon denoted list. So a two arg block might look like

[:a :b | <code goes here> ]

So yay, pass a closure to a service, and it "captures" the values by invoking said closure with arguments.

And then the authors thought, okay, enough sugar for a few days, let's just use this. I mean really really really use this.

You can use a two arg block like that for a zip function of course, but why limit it to iteration? Use it in the standard library to implement the "for each" function. When you looked at it, that was just the "how dare they not have a for syntax" while implementation. But because it wasn't embedded in the syntax, you could copy/paste/modify it to come up with a filter iteration. Or a reduce. Or a map. Or all kinds of interesting compositions, like "selectAndCollectAndReject" with 3 closures.

And why stop there? They decided, "let's just do boolean logic with these block things too". Most languages have special syntax for conditionals, and once they start, they're in competition with their peers to keep adding more and more of them (do while, case, if, if with N elses, on and on). But they just wrote it like

<condition> ifTrue: [trueBlock] ifFalse: [falseBlock]

Sure they optimized it, but from a linguistic point of view, it was the same thing as above. No new sugar was needed.

Whereas many languages have added sugar for optionals (usually involving ?s), this language, 20 years ago, was doing it with closures already. Someone noticed they could implement the following family of "functions"

ifNil: [nilBlock]

ifNil: [nilBlock] notNil: [:notNilValue | notNilBlock]

ifNotNil: [:notNilValue | notNilBlock]

Sure, not as terse as ? (which some endeavoured to deal with), but the language semantics didn't have to change each time there was a new thing to do.
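The same idea can be approximated with closures in Rust. This is a sketch only: `Branch` and `if_true_if_false` are hypothetical names, while `Option::map_or_else` is the real standard library method that plays the role of `ifNil:notNil:`:

```rust
/// Hypothetical extension trait that mimics
/// `cond ifTrue: [...] ifFalse: [...]` with two closures.
trait Branch {
    fn if_true_if_false<T>(self, on_true: impl FnOnce() -> T, on_false: impl FnOnce() -> T) -> T;
}

impl Branch for bool {
    fn if_true_if_false<T>(self, on_true: impl FnOnce() -> T, on_false: impl FnOnce() -> T) -> T {
        // The one place the built-in conditional is still needed.
        if self { on_true() } else { on_false() }
    }
}

/// `ifNil: [...] notNil: [:v | ...]` expressed with `Option::map_or_else`.
fn describe(value: Option<i32>) -> String {
    value.map_or_else(|| String::from("nil"), |v| format!("got {}", v))
}

fn main() {
    println!("{}", (2 > 1).if_true_if_false(|| "yes", || "no"));
    println!("{} {}", describe(None), describe(Some(3)));
}
```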

I'm sure there's a Lisper out there who can write their analog to the above, because it too was one of these "do much with little" languages.


You're essentially describing Lisp already.


Working on low level performance sensitive code in games, this is something I see in code LOTS.

As mentioned in the article, data oriented design runs into the pattern of wanting to iterate over parallel arrays of data frequently.
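For readers who haven't met the pattern, a minimal struct-of-arrays sketch, written in Rust with `zip` rather than Zig's multi-sequence `for` (the `Particles` type and its fields are invented for illustration):

```rust
/// Struct-of-arrays layout: each field lives in its own contiguous Vec.
struct Particles {
    positions: Vec<f32>,
    velocities: Vec<f32>,
}

impl Particles {
    /// Advances every particle by `dt`, walking both arrays in lockstep.
    fn step(&mut self, dt: f32) {
        for (pos, vel) in self.positions.iter_mut().zip(self.velocities.iter()) {
            *pos += *vel * dt;
        }
    }
}

fn main() {
    let mut particles = Particles {
        positions: vec![0.0, 1.0],
        velocities: vec![2.0, 4.0],
    };
    particles.step(0.5);
    println!("{:?}", particles.positions);
}
```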


"Sugar" is implemented by converting an AST into itself, so it wouldn't change its "complexity" at all.


It isn't more flexible. Maybe more expressive or convenient, since you are only describing the flow in a cursory manner.

Sometimes you will need to be explicit about what you want to happen, and then a proper loop facility will let you do that.


To me this looks a lot like closure syntax w/ non-local exits. Seems quite reasonable for a functional programming language.


I think it matters what your target use cases are. This makes me think quite a few people are running ECS systems.


How cache-friendly are zip/enumerate implementations? Zig is influenced by the ideas behind Data Oriented Design, mentioned in the article (and a buried lede, if you ask me). Explicit for loops like this are generally cache-friendly and ideal for eg game programming, as shown in the structs of arrays example.


I tossed together a simple function using enumerate https://godbolt.org/z/PKsEdKvKK

You get the same exact asm as the manual loop.

Of course, the idiom recognition seems to kick in, in both cases, as there's no actual loop here. I tossed in a +sum, which makes that fail, so you get some loops, check it out:

https://godbolt.org/z/1ddf5ded7

They are one instruction different in length, which is kind of amusing to me. Some small differences.


Thanks, that’s exactly what I was asking. I don’t write Rust so it’s informative to see this.


Any time.


As cache-friendly as advancing two pointers and a bounds check.


I don't know how common working with ranges is in Zig. Ruby would iterate on multiple ranges by converting them to arrays

  one = (1..3)
  seven = (7..10)
  (one.to_a + seven.to_a).each {|n| puts n}
I suppose that if it was common they would have added a + method to Range. Actually I think it's possible to implement that with a refinement on the Range class.

Yup, it works. First time I ever used refinements.

  module JoinRanges
    refine Range do
      def +(other)
        self.to_a + other.to_a
      end
    end
  end

  using JoinRanges
  one = (1..3)
  seven = (7..10)
  (one + seven).each {|n| puts n}


This is different. You are concatenating the arrays, whereas the article & discussion are about zipping arrays.
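The difference, sketched in Rust with the same two ranges (the function names are made up for this example):

```rust
/// Concatenation: every item of the first range, then every item of the second.
fn concatenated() -> Vec<i32> {
    (1..=3).chain(7..=10).collect()
}

/// Zipping: items are paired position by position, stopping at the shorter range.
fn zipped() -> Vec<(i32, i32)> {
    (1..=3).zip(7..=10).collect()
}

fn main() {
    println!("{:?}", concatenated()); // [1, 2, 3, 7, 8, 9, 10]
    println!("{:?}", zipped()); // [(1, 7), (2, 8), (3, 9)]
}
```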



