DEC invented Sixels back in the ’80s, and they were serious about their docs, so the corresponding chapter of the VT3xx manual[1] is probably as good as it gets.
> I.e. how large are the pixels [...].
Historical implementations likely assume the relation between pixels and character cells that’s implied by the geometry of the DEC fonts. I’ve seen a lot of arguing about adapting this to the modern world, but I don’t know if a consensus has emerged.
No, there is an escape code that queries the window size in pixels:
"\x1b[14t"
Combined with the escape code that queries the window size in character cells ("\x1b[18t"), you can calculate the number of pixels per character cell (the "pixel size").
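For what it's worth, the arithmetic can be sketched in POSIX sh. This assumes the two replies have already been captured and have the shapes xterm documents for these queries, ESC[4;HEIGHT;WIDTHt and ESC[8;ROWS;COLSt. The function names parse_report and cell_size are made up for illustration.

```sh
# parse_report REPLY
# Print the two numeric fields of an XTWINOPS report, e.g. the reply
# ESC[4;600;800t is printed as "600 800".
parse_report() {
    body=${1#*\[}       # strip everything up to and including '['
    body=${body%t*}     # strip the final 't'
    old_ifs=$IFS
    IFS=';'
    set -- $body        # split on ';' into $1 (report type), $2, $3
    IFS=$old_ifs
    printf '%s %s\n' "$2" "$3"
}

# cell_size PIXEL_REPLY CELL_REPLY
# Print "WIDTH HEIGHT" of one character cell in pixels (integer division).
cell_size() {
    set -- $(parse_report "$1") $(parse_report "$2")
    # now: $1 = height in px, $2 = width in px, $3 = rows, $4 = columns
    [ "$3" -gt 0 ] && [ "$4" -gt 0 ] || return 1
    printf '%s %s\n' "$(( $2 / $4 ))" "$(( $1 / $3 ))"
}
```

With an 800x600 pixel window showing 80x24 cells, cell_size prints "10 25". A terminal that reports zero pixels would give "0 0", so treat that as "unknown" rather than as a real size.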
Are these escape codes actually implemented in the average terminal? I recently tried to get e.g. alacritty to tell me this stuff but I don't even know how you're supposed to read back the response.
You just send the particular query (e.g. ‘CSI 14 t’) and the terminal sends back a response in the defined form¹. Of course you'll want raw mode, echo off, etc. Normally a library like curses does this for you. If you want to see, https://gist.github.com/kpschoedel/6a87ec2157ce2140be69193d1... (I just whipped this up to answer the question; don't expect production quality)
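In case it helps, here is roughly what that looks like without curses, as a POSIX sh sketch. Assumptions: stty and dd are available, the terminal answers within half a second, and the reply ends in 't' (true for the XTWINOPS reports discussed here). query_terminal is a name I made up, and there is no signal handling, so an interrupt mid-query would leave the terminal in raw mode.

```sh
# query_terminal QUERY [DEVICE]
# Send QUERY to the terminal and print its reply. DEVICE defaults to /dev/tty.
query_terminal() {
    query=$1
    tty=${2:-/dev/tty}
    # Bail out unless the device really is a terminal.
    { [ -t 0 ] < "$tty"; } 2>/dev/null || return 1
    saved=$(stty -g < "$tty") || return 1
    # Raw mode, echo off; a read gives up after 0.5 s (time is in tenths).
    stty raw -echo min 0 time 5 < "$tty"
    printf '%s' "$query" > "$tty"
    reply=
    # Read one byte at a time until the final 't' or a timeout (empty read).
    while c=$(dd bs=1 count=1 2>/dev/null < "$tty") && [ -n "$c" ]; do
        reply=$reply$c
        [ "$c" = t ] && break
    done
    stty "$saved" < "$tty"      # restore the previous terminal settings
    printf '%s\n' "$reply"
}
```

Usage would be something like `reply=$(query_terminal "$(printf '\033[14t')")`. An empty reply means the terminal did not answer in time, which is how you find out a given emulator doesn't implement the query.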
Most implementations I've seen use an ioctl to query those particular bits. That's implemented quite reliably, since the same ioctl (TIOCGWINSZ) reports both the size in character cells and the size in pixels. Some implementations just leave the pixel fields at zero, though.
With a freshly launched terminal on my machine, I get:
- in XTerm (xterm -xrm "XTerm*decTerminalID: vt340"), "before ", "HI", " after", that is to say exactly what you want, out of the box;
- in Foot, "before ", "HI", newline, some spaces, "after";
- in Contour, "before ", "HI", enough newlines to clear the screen (?..), no spaces (?!..), "after".
OK, sez I, let’s just save the cursor position (DECSC, ESC 7) before the image and restore it (DECRC, ESC 8) afterwards, then skip over it; that is,
DECSC=$ESC'7'; DECRC=$ESC'8' # add to definitions
printf "before $DECSC$SIXEL$DECRC$CUF after$LF" # change format string
In XTerm, this (rightly) makes no difference. In Foot and Contour, however, you still end up a line (Foot) or a screen (Contour) below where you started, though now at the correct horizontal position.
So it seems to me like what you want should work by default, except it doesn’t.
It should be possible to instead just treat the whole thing as a framebuffer overlay (by computing or directly asking for the character cell size, as Kirill Panov rightly admonishes me is possible with XTWINOPS) without touching the cursor; that’s what the “sixel scrolling” setting (DECSDM) is supposed to do. Then you can just manually move the cursor forward however many positions after you’re done drawing.
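The "move the cursor forward however many positions" step is just a rounded-up division. A minimal sketch in POSIX sh, assuming you already know the image width in pixels and the cell width in pixels (e.g. from XTWINOPS). The name cuf_past_image is hypothetical, and the function deliberately does not touch DECSDM itself.

```sh
# cuf_past_image IMAGE_WIDTH_PX CELL_WIDTH_PX
# Print a CUF sequence (CSI Pn C) that moves the cursor right by the number
# of character cells the image covers, rounding a partly covered cell up.
cuf_past_image() {
    [ "$2" -gt 0 ] || return 1          # unknown cell width: give up
    cols=$(( ($1 + $2 - 1) / $2 ))      # ceiling of image width / cell width
    [ "$cols" -gt 0 ] || return 0       # CUF 0 would still move by one
    printf '\033[%dC' "$cols"
}
```

For a 45-pixel-wide image and 10-pixel-wide cells this emits ESC[5C, i.e. it skips five columns.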
Except apparently the DEC manual (the VT330/340 one above) and DEC hardware contradict each other as to which setting of DECSDM (set or reset) corresponds to which scrolling state (enabled or disabled). XTerm first implemented it according to the manual, not the VT3xx[1,2,3]; then most other emulators followed suit[4]; then XTerm switched to following the hardware[5,6] (unless you [...]), and that’s what I’m seeing on my machine right now. So now you need to check if you’re on XTerm ≥ 369 or not[7]. And also check other terminals’ versions, because apparently that’s a thing now[8,9].
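A sketch of what that check could look like in POSIX sh, assuming you ask for XTVERSION (CSI > 0 q) and XTerm answers with a DCS string of the form ESC P > | XTerm(NNN) ESC \. The 369 threshold is the one quoted above. The function names are hypothetical, and other terminals would need their own tables.

```sh
# xterm_patch REPLY
# Print the patch number from an XTVERSION reply such as "XTerm(369)".
# Fails if the reply does not name XTerm.
xterm_patch() {
    case $1 in
        *'XTerm('*')'*)
            v=${1#*XTerm\(}     # drop everything through "XTerm("
            v=${v%%\)*}         # drop the first ")" and what follows
            case $v in
                ''|*[!0-9]*) return 1 ;;    # not a plain number
            esac
            printf '%s\n' "$v" ;;
        *) return 1 ;;
    esac
}

# xterm_decsdm_follows_hardware REPLY
# Succeeds if REPLY identifies an XTerm at patch 369 or later.
xterm_decsdm_follows_hardware() {
    patch=$(xterm_patch "$1") || return 1
    [ "$patch" -ge 369 ]
}
```

A failure here only means "not a recent XTerm". It says nothing about which way any other emulator interprets DECSDM.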
Again, ouch.
P.S. DEC had an internal doc for how their terminals should operate (DEC STD 070) [10]. It does not document DECSDM at all.
Nice. I wish I'd had that years ago when the maintainer of a then-popular virtual terminal got very angry at me for suggesting that DECCOLM (set 80/132 columns) should not change the number of lines.
It's interesting to read the discussion about Sixel support in Kitty [1], where the pros and cons of Sixel are weighed in relation to Kitty. I find this comment [2] by the maintainer of libsixel especially intriguing:
> After I took over the maintainership of libsixel I unfortunately decided it cannot support the security demands of Kitty, it is too insecure internally. I need to write a Rust library or something.
My apologies — I dislike seeing unexplained acronyms myself. As detaro answered before me, it's ‘not invented here’, the tendency to reject existing solutions for a sense of control.
Recently discovered Kitty's graphics protocol (https://sw.kovidgoyal.net/kitty/graphics-protocol/) which has more features or at least more documented ones :)